Cruys, T. van de; Moirón, B.V.: Semantics-based multiword expression extraction (2007)
- Abstract
- This paper describes a fully unsupervised and automated method for large-scale extraction of multiword expressions (MWEs) from large corpora. The method aims at capturing the non-compositionality of MWEs; the intuition is that a noun within an MWE cannot easily be replaced by a semantically similar noun. To implement this intuition, a noun clustering is automatically extracted (using distributional similarity measures), which gives us clusters of semantically related nouns. Next, a number of statistical measures, based on selectional preferences, are developed that formalize the intuition of non-compositionality. Our approach has been tested on Dutch, and automatically evaluated using Dutch lexical resources.
- Source
- Proceedings of the Workshop on A Broader Perspective on Multiword Expressions, Prague, 2007
- Type
- a