Search (1 result, page 1 of 1)

  • author_ss:"Moirón, B.V."
  • theme_ss:"Computerlinguistik"
  • year_i:[2000 TO 2010}
  1. Cruys, T. van de; Moirón, B.V.: Semantics-based multiword expression extraction (2007) 0.02
    0.017565278 = product of:
      0.052695833 = sum of:
        0.052695833 = weight(_text_:resources in 2919) [ClassicSimilarity], result of:
          0.052695833 = score(doc=2919,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.28231642 = fieldWeight in 2919, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2919)
      0.33333334 = coord(1/3)
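
The score breakdown above can be re-derived by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which appear to match the numbers shown. The function names are hypothetical; queryNorm and fieldNorm are copied from the listing rather than recomputed, since they depend on the full query and on the indexed field length.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as assumed here: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Term frequency factor as assumed here: square root of the raw frequency.
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, matched, total):
    # Mirrors the explain tree: score = queryWeight * fieldWeight * coord.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    coord = matched / total                              # matched clauses / total clauses
    return query_weight * field_weight * coord


# Values taken from the listing for doc 2919 and the term "resources".
score = explain_score(
    freq=2.0,
    doc_freq=3122,
    max_docs=44218,
    query_norm=0.051133685,
    field_norm=0.0546875,
    matched=1,
    total=3,
)
print(score)
```

The printed value should agree with the listed 0.017565278 up to floating-point rounding, since Lucene computes in single precision and this sketch uses double precision.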
    
    Abstract
    This paper describes a fully unsupervised and automated method for large-scale extraction of multiword expressions (MWEs) from large corpora. The method aims at capturing the non-compositionality of MWEs; the intuition is that a noun within an MWE cannot easily be replaced by a semantically similar noun. To implement this intuition, a noun clustering is automatically extracted (using distributional similarity measures), which gives us clusters of semantically related nouns. Next, a number of statistical measures, based on selectional preferences, are developed that formalize the intuition of non-compositionality. Our approach has been tested on Dutch, and automatically evaluated using Dutch lexical resources.