Document (#39920)

Author
Rayson, P.
Piao, S.
Sharoff, S.
Evert, S.
Moiron, B.V.
Title
Multiword expressions : hard going or plain sailing?
Source
Language resources and evaluation. 44(2010) no.1, S.1-5
Year
2010
Abstract
Over the past two decades or so, Multi-Word Expressions (MWEs; also called Multi-word Units) have been an increasingly important concern for Computational Linguistics and Natural Language Processing (NLP). The term MWE has been used to refer to various types of linguistic units and expressions, including idioms, noun compounds, phrasal verbs, light verbs and other habitual collocations. However, while there is no universally agreed definition for MWE as yet, most researchers use the term to refer to those frequently occurring phrasal units which are subject to a certain level of semantic opaqueness, or non-compositionality. Non-compositional MWEs pose tough challenges for automatic analysis because their interpretation cannot be achieved by directly combining the semantics of their constituents, thereby causing the "pain in the neck of NLP".
Theme
Computerlinguistik

Similar documents (content)

  1. Cruys, T. van de; Moirón, B.V.: Semantics-based multiword expression extraction (2007) 0.29
    0.29452464 = sum of:
      0.29452464 = product of:
        1.2271861 = sum of:
          0.01870723 = weight(abstract_txt:been in 4920) [ClassicSimilarity], result of:
            0.01870723 = score(doc=4920,freq=1.0), product of:
              0.05486916 = queryWeight, product of:
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.015087538 = queryNorm
              0.34094253 = fieldWeight in 4920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
          0.16062789 = weight(abstract_txt:noun in 4920) [ClassicSimilarity], result of:
            0.16062789 = score(doc=4920,freq=3.0), product of:
              0.1266151 = queryWeight, product of:
                1.0741467 = boost
                7.8127427 = idf(docFreq=46, maxDocs=42740)
                0.015087538 = queryNorm
              1.2686313 = fieldWeight in 4920, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.8127427 = idf(docFreq=46, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
          0.12448725 = weight(abstract_txt:multiword in 4920) [ClassicSimilarity], result of:
            0.12448725 = score(doc=4920,freq=1.0), product of:
              0.15407372 = queryWeight, product of:
                1.1849093 = boost
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.015087538 = queryNorm
              0.807972 = fieldWeight in 4920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
          0.25232407 = weight(abstract_txt:compositionality in 4920) [ClassicSimilarity], result of:
            0.25232407 = score(doc=4920,freq=2.0), product of:
              0.19585791 = queryWeight, product of:
                1.3359537 = boost
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.015087538 = queryNorm
              1.2883017 = fieldWeight in 4920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
          0.4841281 = weight(abstract_txt:mwes in 4920) [ClassicSimilarity], result of:
            0.4841281 = score(doc=4920,freq=2.0), product of:
              0.38102385 = queryWeight, product of:
                2.63519 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.015087538 = queryNorm
              1.2705978 = fieldWeight in 4920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
          0.18691155 = weight(abstract_txt:expressions in 4920) [ClassicSimilarity], result of:
            0.18691155 = score(doc=4920,freq=1.0), product of:
              0.29136887 = queryWeight, product of:
                2.8222992 = boost
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.015087538 = queryNorm
              0.6414946 = fieldWeight in 4920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.09375 = fieldNorm(doc=4920)
        0.24 = coord(6/25)
    
  2. Dias, G.: Multiword unit hybrid extraction (o.J.) 0.22
    0.21983549 = sum of:
      0.21983549 = product of:
        0.91598123 = sum of:
          0.015589358 = weight(abstract_txt:been in 2644) [ClassicSimilarity], result of:
            0.015589358 = score(doc=2644,freq=1.0), product of:
              0.05486916 = queryWeight, product of:
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.015087538 = queryNorm
              0.28411877 = fieldWeight in 2644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
          0.17968185 = weight(abstract_txt:multiword in 2644) [ClassicSimilarity], result of:
            0.17968185 = score(doc=2644,freq=3.0), product of:
              0.15407372 = queryWeight, product of:
                1.1849093 = boost
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.015087538 = queryNorm
              1.1662071 = fieldWeight in 2644, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
          0.052591514 = weight(abstract_txt:word in 2644) [ClassicSimilarity], result of:
            0.052591514 = score(doc=2644,freq=1.0), product of:
              0.1234203 = queryWeight, product of:
                1.4997854 = boost
                5.4543004 = idf(docFreq=496, maxDocs=42740)
                0.015087538 = queryNorm
              0.4261172 = fieldWeight in 2644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4543004 = idf(docFreq=496, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
          0.17852043 = weight(abstract_txt:verbs in 2644) [ClassicSimilarity], result of:
            0.17852043 = score(doc=2644,freq=1.0), product of:
              0.27876276 = queryWeight, product of:
                2.2539964 = boost
                8.197155 = idf(docFreq=31, maxDocs=42740)
                0.015087538 = queryNorm
              0.64040273 = fieldWeight in 2644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.197155 = idf(docFreq=31, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
          0.2973668 = weight(abstract_txt:phrasal in 2644) [ClassicSimilarity], result of:
            0.2973668 = score(doc=2644,freq=1.0), product of:
              0.39171582 = queryWeight, product of:
                2.6719074 = boost
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.015087538 = queryNorm
              0.75913906 = fieldWeight in 2644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
          0.19223131 = weight(abstract_txt:units in 2644) [ClassicSimilarity], result of:
            0.19223131 = score(doc=2644,freq=2.0), product of:
              0.26608026 = queryWeight, product of:
                2.6970425 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.015087538 = queryNorm
              0.7224561 = fieldWeight in 2644, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.078125 = fieldNorm(doc=2644)
        0.24 = coord(6/25)
    
  3. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.20
    0.20178756 = sum of:
      0.20178756 = product of:
        1.2611723 = sum of:
          0.12448725 = weight(abstract_txt:multiword in 4921) [ClassicSimilarity], result of:
            0.12448725 = score(doc=4921,freq=1.0), product of:
              0.15407372 = queryWeight, product of:
                1.1849093 = boost
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.015087538 = queryNorm
              0.807972 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.09375 = fieldNorm(doc=4921)
          0.35684013 = weight(abstract_txt:compositionality in 4921) [ClassicSimilarity], result of:
            0.35684013 = score(doc=4921,freq=4.0), product of:
              0.19585791 = queryWeight, product of:
                1.3359537 = boost
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.015087538 = queryNorm
              1.8219337 = fieldWeight in 4921, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.71698 = idf(docFreq=6, maxDocs=42740)
                0.09375 = fieldNorm(doc=4921)
          0.59293336 = weight(abstract_txt:mwes in 4921) [ClassicSimilarity], result of:
            0.59293336 = score(doc=4921,freq=3.0), product of:
              0.38102385 = queryWeight, product of:
                2.63519 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.015087538 = queryNorm
              1.5561581 = fieldWeight in 4921, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.09375 = fieldNorm(doc=4921)
          0.18691155 = weight(abstract_txt:expressions in 4921) [ClassicSimilarity], result of:
            0.18691155 = score(doc=4921,freq=1.0), product of:
              0.29136887 = queryWeight, product of:
                2.8222992 = boost
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.015087538 = queryNorm
              0.6414946 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.09375 = fieldNorm(doc=4921)
        0.16 = coord(4/25)
    
  4. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.20
    0.19813 = sum of:
      0.19813 = product of:
        0.8255417 = sum of:
          0.038641065 = weight(abstract_txt:noun in 3537) [ClassicSimilarity], result of:
            0.038641065 = score(doc=3537,freq=1.0), product of:
              0.1266151 = queryWeight, product of:
                1.0741467 = boost
                7.8127427 = idf(docFreq=46, maxDocs=42740)
                0.015087538 = queryNorm
              0.30518526 = fieldWeight in 3537, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8127427 = idf(docFreq=46, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
          0.121239625 = weight(abstract_txt:compounds in 3537) [ClassicSimilarity], result of:
            0.121239625 = score(doc=3537,freq=6.0), product of:
              0.14933631 = queryWeight, product of:
                1.1665505 = boost
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.015087538 = queryNorm
              0.8118563 = fieldWeight in 3537, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
          0.17203231 = weight(abstract_txt:multiword in 3537) [ClassicSimilarity], result of:
            0.17203231 = score(doc=3537,freq=11.0), product of:
              0.15407372 = queryWeight, product of:
                1.1849093 = boost
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.015087538 = queryNorm
              1.1165584 = fieldWeight in 3537, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
          0.026295757 = weight(abstract_txt:word in 3537) [ClassicSimilarity], result of:
            0.026295757 = score(doc=3537,freq=1.0), product of:
              0.1234203 = queryWeight, product of:
                1.4997854 = boost
                5.4543004 = idf(docFreq=496, maxDocs=42740)
                0.015087538 = queryNorm
              0.2130586 = fieldWeight in 3537, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4543004 = idf(docFreq=496, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
          0.24705558 = weight(abstract_txt:mwes in 3537) [ClassicSimilarity], result of:
            0.24705558 = score(doc=3537,freq=3.0), product of:
              0.38102385 = queryWeight, product of:
                2.63519 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.015087538 = queryNorm
              0.64839923 = fieldWeight in 3537, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
          0.22027738 = weight(abstract_txt:expressions in 3537) [ClassicSimilarity], result of:
            0.22027738 = score(doc=3537,freq=8.0), product of:
              0.29136887 = queryWeight, product of:
                2.8222992 = boost
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.015087538 = queryNorm
              0.7560086 = fieldWeight in 3537, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3537)
        0.24 = coord(6/25)
    
  5. Ramisch, C.; Villavicencio, A.; Kordoni, V.: Introduction to the special issue on multiword expressions : from theory to practice and use (2013) 0.13
    0.1267958 = sum of:
      0.1267958 = product of:
        0.79247373 = sum of:
          0.022046681 = weight(abstract_txt:been in 3125) [ClassicSimilarity], result of:
            0.022046681 = score(doc=3125,freq=2.0), product of:
              0.05486916 = queryWeight, product of:
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.015087538 = queryNorm
              0.40180463 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6367204 = idf(docFreq=3059, maxDocs=42740)
                0.078125 = fieldNorm(doc=3125)
          0.14670964 = weight(abstract_txt:multiword in 3125) [ClassicSimilarity], result of:
            0.14670964 = score(doc=3125,freq=2.0), product of:
              0.15407372 = queryWeight, product of:
                1.1849093 = boost
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.015087538 = queryNorm
              0.95220417 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.618368 = idf(docFreq=20, maxDocs=42740)
                0.078125 = fieldNorm(doc=3125)
          0.40344003 = weight(abstract_txt:mwes in 3125) [ClassicSimilarity], result of:
            0.40344003 = score(doc=3125,freq=2.0), product of:
              0.38102385 = queryWeight, product of:
                2.63519 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.015087538 = queryNorm
              1.0588315 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.078125 = fieldNorm(doc=3125)
          0.22027738 = weight(abstract_txt:expressions in 3125) [ClassicSimilarity], result of:
            0.22027738 = score(doc=3125,freq=2.0), product of:
              0.29136887 = queryWeight, product of:
                2.8222992 = boost
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.015087538 = queryNorm
              0.7560086 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.842609 = idf(docFreq=123, maxDocs=42740)
                0.078125 = fieldNorm(doc=3125)
        0.16 = coord(4/25)
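
Note on the score breakdowns above: each tree follows a TF-IDF pattern. A term's score is queryWeight times fieldWeight, the term scores are summed, and the sum is multiplied by the coord factor. The following is a minimal Python sketch that recomputes the first hit (doc 4920) from the values printed in its tree. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions inferred from the printed numbers, which they reproduce. The function names are illustrative and are not part of any library API. fieldNorm and queryNorm are taken as given, not derived.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces the idf values shown in the trees,
    # e.g. docFreq=3059, maxDocs=42740 gives about 3.6367.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Assumed formula; matches e.g. tf(freq=3.0) = 1.7320508.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Constants copied from the tree for doc 4920.
MAX_DOCS = 42740
QUERY_NORM = 0.015087538
FIELD_NORM = 0.09375

# (term, freq, docFreq, boost) as printed for doc 4920;
# "been" has no boost line, so 1.0 is assumed.
TERMS_4920 = [
    ("been", 1.0, 3059, 1.0),
    ("noun", 3.0, 46, 1.0741467),
    ("multiword", 1.0, 20, 1.1849093),
    ("compositionality", 2.0, 6, 1.3359537),
    ("mwes", 2.0, 7, 2.63519),
    ("expressions", 1.0, 123, 2.8222992),
]

raw_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM, boost)
    for _, freq, doc_freq, boost in TERMS_4920
)
coord = 6 / 25  # 6 of the 25 query terms matched this document
total = coord * raw_sum

print(f"sum of term scores: {raw_sum:.5f}")
print(f"final score:        {total:.5f}")
```

The sketch uses double precision, so the results can differ from the printed single-precision values in the last digits. The same functions apply to the other four hits once their fieldNorm, term frequencies, and coord values are substituted.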