Document (#38651)

Author
Ramisch, C.
Title
Multiword expressions acquisition : a generic and open framework
Imprint
Cham : Springer International Publishing
Year
2015
Pages
XIV, 230 pp.
ISBN
978-3-319-09206-5
Series
Theory and applications of natural language processing
Abstract
This book is an excellent introduction to multiword expressions. It provides a unique, comprehensive and up-to-date overview of this exciting topic in computational linguistics. The first part describes the diversity and richness of multiword expressions, including many examples in several languages. These constructions are not only complex and arbitrary, but also much more frequent than one would guess, making them a real nightmare for natural language processing applications. The second part introduces a new generic framework for automatic acquisition of multiword expressions from texts. Furthermore, it describes the accompanying free software tool, the mwetoolkit, which comes in handy when looking for expressions in texts (regardless of the language). Evaluation is greatly emphasized, underlining the fact that results depend on parameters like corpus size, language, MWE type, etc. The last part contains solid experimental results and evaluates the mwetoolkit, demonstrating its usefulness for computer-assisted lexicography and machine translation. This is the first book to cover the whole pipeline of multiword expression acquisition in a single volume. It addresses the needs of students and researchers in computational and theoretical linguistics, cognitive sciences, artificial intelligence and computer science. Its good balance between computational and linguistic views makes it the perfect starting point for anyone interested in multiword expressions, language and text processing in general.
Content
1.Introduction.- Part I.Multiword Expressions: a Tough Nut to Crack.- 2.Definitions and Characteristics.- 3.State of the Art in MWE Processing.- Part II.MWE Acquisition.- 4.Evaluation of MWE Acquisition.- 5.A New Framework for MWE Acquisition.- Part III.Applications.- 6.Application 1: Lexicography.- 7.Application 2: Machine Translation.- 8.Conclusions.- Appendixes.- A.Extended List of Translation Examples.- B.Resources Used in the Experiments.- C.The mwetoolkit: Documentation.- D.Tagsets for POS and Syntax.- E.Detailed Lexicon Descriptions.
Footnote
Note in the DNB catalog: "Clearly does not belong to the collection scope of the Deutsche Nationalbibliothek"
Theme
Computational linguistics
GHBS
BFP (FH K)

Similar documents (content)

  1. Ramisch, C.; Villavicencio, A.; Kordoni, V.: Introduction to the special issue on multiword expressions : from theory to practice and use (2013) 0.37
    0.37454832 = sum of:
      0.37454832 = product of:
        1.3376726 = sum of:
          0.018582592 = weight(abstract_txt:first in 3125) [ClassicSimilarity], result of:
            0.018582592 = score(doc=3125,freq=1.0), product of:
              0.056393173 = queryWeight, product of:
                1.0980529 = boost
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.01217625 = queryNorm
              0.32951847 = fieldWeight in 3125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.030005252 = weight(abstract_txt:processing in 3125) [ClassicSimilarity], result of:
            0.030005252 = score(doc=3125,freq=1.0), product of:
              0.07761647 = queryWeight, product of:
                1.2882107 = boost
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.01217625 = queryNorm
              0.38658357 = fieldWeight in 3125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.10420568 = weight(abstract_txt:linguistics in 3125) [ClassicSimilarity], result of:
            0.10420568 = score(doc=3125,freq=2.0), product of:
              0.1412776 = queryWeight, product of:
                1.7379874 = boost
                6.6759505 = idf(docFreq=144, maxDocs=42306)
                0.01217625 = queryNorm
              0.73759526 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6759505 = idf(docFreq=144, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.036642347 = weight(abstract_txt:language in 3125) [ClassicSimilarity], result of:
            0.036642347 = score(doc=3125,freq=1.0), product of:
              0.111726075 = queryWeight, product of:
                2.1857588 = boost
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.01217625 = queryNorm
              0.32796595 = fieldWeight in 3125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.14052312 = weight(abstract_txt:computational in 3125) [ClassicSimilarity], result of:
            0.14052312 = score(doc=3125,freq=2.0), product of:
              0.19739735 = queryWeight, product of:
                2.5160904 = boost
                6.443198 = idf(docFreq=182, maxDocs=42306)
                0.01217625 = queryNorm
              0.7118795 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.443198 = idf(docFreq=182, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.33751324 = weight(abstract_txt:expressions in 3125) [ClassicSimilarity], result of:
            0.33751324 = score(doc=3125,freq=2.0), product of:
              0.44604635 = queryWeight, product of:
                5.348852 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01217625 = queryNorm
              0.7566775 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
          0.67020035 = weight(abstract_txt:multiword in 3125) [ClassicSimilarity], result of:
            0.67020035 = score(doc=3125,freq=2.0), product of:
              0.70467556 = queryWeight, product of:
                6.7230325 = boost
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.01217625 = queryNorm
              0.9510765 = fieldWeight in 3125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.078125 = fieldNorm(doc=3125)
        0.28 = coord(7/25)
    
  2. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.37
    0.3712535 = sum of:
      0.3712535 = product of:
        1.3259053 = sum of:
          0.0131398775 = weight(abstract_txt:first in 3537) [ClassicSimilarity], result of:
            0.0131398775 = score(doc=3537,freq=2.0), product of:
              0.056393173 = queryWeight, product of:
                1.0980529 = boost
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.01217625 = queryNorm
              0.23300475 = fieldWeight in 3537, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.08688269 = weight(abstract_txt:constructions in 3537) [ClassicSimilarity], result of:
            0.08688269 = score(doc=3537,freq=5.0), product of:
              0.11617995 = queryWeight, product of:
                1.11445 = boost
                8.561642 = idf(docFreq=21, maxDocs=42306)
                0.01217625 = queryNorm
              0.7478286 = fieldWeight in 3537, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.561642 = idf(docFreq=21, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.015002626 = weight(abstract_txt:processing in 3537) [ClassicSimilarity], result of:
            0.015002626 = score(doc=3537,freq=1.0), product of:
              0.07761647 = queryWeight, product of:
                1.2882107 = boost
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.01217625 = queryNorm
              0.19329178 = fieldWeight in 3537, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.050845087 = weight(abstract_txt:texts in 3537) [ClassicSimilarity], result of:
            0.050845087 = score(doc=3537,freq=5.0), product of:
              0.10241219 = queryWeight, product of:
                1.4797413 = boost
                5.6839767 = idf(docFreq=390, maxDocs=42306)
                0.01217625 = queryNorm
              0.49647495 = fieldWeight in 3537, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.6839767 = idf(docFreq=390, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.036642347 = weight(abstract_txt:language in 3537) [ClassicSimilarity], result of:
            0.036642347 = score(doc=3537,freq=4.0), product of:
              0.111726075 = queryWeight, product of:
                2.1857588 = boost
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.01217625 = queryNorm
              0.32796595 = fieldWeight in 3537, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.33751324 = weight(abstract_txt:expressions in 3537) [ClassicSimilarity], result of:
            0.33751324 = score(doc=3537,freq=8.0), product of:
              0.44604635 = queryWeight, product of:
                5.348852 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01217625 = queryNorm
              0.7566775 = fieldWeight in 3537, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
          0.78587955 = weight(abstract_txt:multiword in 3537) [ClassicSimilarity], result of:
            0.78587955 = score(doc=3537,freq=11.0), product of:
              0.70467556 = queryWeight, product of:
                6.7230325 = boost
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.01217625 = queryNorm
              1.115236 = fieldWeight in 3537, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.0390625 = fieldNorm(doc=3537)
        0.28 = coord(7/25)
    
  3. Helbig, H.: Knowledge representation and the semantics of natural language (2014) 0.23
    0.23174988 = sum of:
      0.23174988 = product of:
        0.64374965 = sum of:
          0.015882455 = weight(abstract_txt:computer in 4397) [ClassicSimilarity], result of:
            0.015882455 = score(doc=4397,freq=1.0), product of:
              0.05893511 = queryWeight, product of:
                1.1225276 = boost
                4.3118486 = idf(docFreq=1541, maxDocs=42306)
                0.01217625 = queryNorm
              0.26949054 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3118486 = idf(docFreq=1541, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.02287402 = weight(abstract_txt:book in 4397) [ClassicSimilarity], result of:
            0.02287402 = score(doc=4397,freq=1.0), product of:
              0.075160675 = queryWeight, product of:
                1.2676674 = boost
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.01217625 = queryNorm
              0.30433494 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.024004202 = weight(abstract_txt:processing in 4397) [ClassicSimilarity], result of:
            0.024004202 = score(doc=4397,freq=1.0), product of:
              0.07761647 = queryWeight, product of:
                1.2882107 = boost
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.01217625 = queryNorm
              0.30926687 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.03638178 = weight(abstract_txt:texts in 4397) [ClassicSimilarity], result of:
            0.03638178 = score(doc=4397,freq=1.0), product of:
              0.10241219 = queryWeight, product of:
                1.4797413 = boost
                5.6839767 = idf(docFreq=390, maxDocs=42306)
                0.01217625 = queryNorm
              0.35524854 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6839767 = idf(docFreq=390, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.08336455 = weight(abstract_txt:linguistics in 4397) [ClassicSimilarity], result of:
            0.08336455 = score(doc=4397,freq=2.0), product of:
              0.1412776 = queryWeight, product of:
                1.7379874 = boost
                6.6759505 = idf(docFreq=144, maxDocs=42306)
                0.01217625 = queryNorm
              0.5900762 = fieldWeight in 4397, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6759505 = idf(docFreq=144, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.028827988 = weight(abstract_txt:part in 4397) [ClassicSimilarity], result of:
            0.028827988 = score(doc=4397,freq=1.0), product of:
              0.10038504 = queryWeight, product of:
                1.7942796 = boost
                4.594786 = idf(docFreq=1161, maxDocs=42306)
                0.01217625 = queryNorm
              0.28717414 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.594786 = idf(docFreq=1161, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.08291217 = weight(abstract_txt:language in 4397) [ClassicSimilarity], result of:
            0.08291217 = score(doc=4397,freq=8.0), product of:
              0.111726075 = queryWeight, product of:
                2.1857588 = boost
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.01217625 = queryNorm
              0.7421022 = fieldWeight in 4397, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.07949189 = weight(abstract_txt:computational in 4397) [ClassicSimilarity], result of:
            0.07949189 = score(doc=4397,freq=1.0), product of:
              0.19739735 = queryWeight, product of:
                2.5160904 = boost
                6.443198 = idf(docFreq=182, maxDocs=42306)
                0.01217625 = queryNorm
              0.4026999 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.443198 = idf(docFreq=182, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
          0.2700106 = weight(abstract_txt:expressions in 4397) [ClassicSimilarity], result of:
            0.2700106 = score(doc=4397,freq=2.0), product of:
              0.44604635 = queryWeight, product of:
                5.348852 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01217625 = queryNorm
              0.60534203 = fieldWeight in 4397, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.0625 = fieldNorm(doc=4397)
        0.36 = coord(9/25)
    
  4. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.19
    0.19292909 = sum of:
      0.19292909 = product of:
        0.96464545 = sum of:
          0.029594952 = weight(abstract_txt:framework in 4921) [ClassicSimilarity], result of:
            0.029594952 = score(doc=4921,freq=1.0), product of:
              0.06810515 = queryWeight, product of:
                1.2067018 = boost
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.01217625 = queryNorm
              0.43454796 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.09375 = fieldNorm(doc=4921)
          0.0360063 = weight(abstract_txt:processing in 4921) [ClassicSimilarity], result of:
            0.0360063 = score(doc=4921,freq=1.0), product of:
              0.07761647 = queryWeight, product of:
                1.2882107 = boost
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.01217625 = queryNorm
              0.4639003 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.94827 = idf(docFreq=815, maxDocs=42306)
                0.09375 = fieldNorm(doc=4921)
          0.04397082 = weight(abstract_txt:language in 4921) [ClassicSimilarity], result of:
            0.04397082 = score(doc=4921,freq=1.0), product of:
              0.111726075 = queryWeight, product of:
                2.1857588 = boost
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.01217625 = queryNorm
              0.39355916 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.197964 = idf(docFreq=1727, maxDocs=42306)
                0.09375 = fieldNorm(doc=4921)
          0.2863895 = weight(abstract_txt:expressions in 4921) [ClassicSimilarity], result of:
            0.2863895 = score(doc=4921,freq=1.0), product of:
              0.44604635 = queryWeight, product of:
                5.348852 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01217625 = queryNorm
              0.6420622 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.09375 = fieldNorm(doc=4921)
          0.56868386 = weight(abstract_txt:multiword in 4921) [ClassicSimilarity], result of:
            0.56868386 = score(doc=4921,freq=1.0), product of:
              0.70467556 = queryWeight, product of:
                6.7230325 = boost
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.01217625 = queryNorm
              0.8070152 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.09375 = fieldNorm(doc=4921)
        0.2 = coord(5/25)
    
  5. Ramisch, C.; Schreiner, P.; Idiart, M.; Villavicencio, A.: An evaluation of methods for the extraction of multiword expressions (20xx) 0.18
    0.1842242 = sum of:
      0.1842242 = product of:
        1.1514013 = sum of:
          0.026015628 = weight(abstract_txt:first in 2963) [ClassicSimilarity], result of:
            0.026015628 = score(doc=2963,freq=1.0), product of:
              0.056393173 = queryWeight, product of:
                1.0980529 = boost
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.01217625 = queryNorm
              0.46132585 = fieldWeight in 2963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2178364 = idf(docFreq=1693, maxDocs=42306)
                0.109375 = fieldNorm(doc=2963)
          0.12780005 = weight(abstract_txt:acquisition in 2963) [ClassicSimilarity], result of:
            0.12780005 = score(doc=2963,freq=1.0), product of:
              0.18654692 = queryWeight, product of:
                2.4459617 = boost
                6.2636123 = idf(docFreq=218, maxDocs=42306)
                0.01217625 = queryNorm
              0.6850826 = fieldWeight in 2963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2636123 = idf(docFreq=218, maxDocs=42306)
                0.109375 = fieldNorm(doc=2963)
          0.33412108 = weight(abstract_txt:expressions in 2963) [ClassicSimilarity], result of:
            0.33412108 = score(doc=2963,freq=1.0), product of:
              0.44604635 = queryWeight, product of:
                5.348852 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01217625 = queryNorm
              0.74907255 = fieldWeight in 2963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.109375 = fieldNorm(doc=2963)
          0.6634645 = weight(abstract_txt:multiword in 2963) [ClassicSimilarity], result of:
            0.6634645 = score(doc=2963,freq=1.0), product of:
              0.70467556 = queryWeight, product of:
                6.7230325 = boost
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.01217625 = queryNorm
              0.9415177 = fieldWeight in 2963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.608162 = idf(docFreq=20, maxDocs=42306)
                0.109375 = fieldNorm(doc=2963)
        0.16 = coord(4/25)
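
The score explanations in the listing above all follow the same arithmetic: each matching term contributes queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), with tf equal to the square root of the term frequency, and the sum over terms is multiplied by the coord factor (matched terms / query terms). The following is a minimal sketch that recomputes the score of document 3125 from the values printed in its explanation. The function and variable names are illustrative assumptions and are not part of the catalog software or of any Lucene API.

```python
import math

def term_score(boost, idf, freq, query_norm, field_norm):
    """Score of one term, following the structure of the explanation tree."""
    query_weight = boost * idf * query_norm          # "queryWeight, product of"
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight, product of"
    return query_weight * field_weight

def document_score(terms, query_norm, field_norm, matched, total):
    """Sum of term scores scaled by coord(matched/total)."""
    coord = matched / total
    return coord * sum(
        term_score(boost, idf, freq, query_norm, field_norm)
        for boost, idf, freq in terms
    )

# (boost, idf, termFreq) copied from the explanation for doc 3125.
DOC_3125_TERMS = [
    (1.0980529, 4.2178364, 1.0),  # first
    (1.2882107, 4.94827, 1.0),    # processing
    (1.7379874, 6.6759505, 2.0),  # linguistics
    (2.1857588, 4.197964, 1.0),   # language
    (2.5160904, 6.443198, 2.0),   # computational
    (5.348852, 6.8486633, 2.0),   # expressions
    (6.7230325, 8.608162, 2.0),   # multiword
]

score_3125 = document_score(
    DOC_3125_TERMS,
    query_norm=0.01217625,
    field_norm=0.078125,
    matched=7,
    total=25,
)
print(round(score_3125, 4))  # about 0.3745, the value listed for entry 1
```

The listed values appear to be single-precision floats, so a recomputation in double precision should agree only up to the last printed digits.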