Document (#38123)

Author
Manning, C.D.
Title
Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics?
Source
Computational Linguistics and Intelligent Text Processing, 12th International Conference, CICLing 2011, Proceedings, Part I. Ed.: Alexander Gelbukh
Imprint
Berlin : Springer
Year
2011
Pages
pp. 171-189
Series
Lecture notes in computer science; 6608
Abstract
I examine what would be necessary to move part-of-speech tagging performance from its current level of about 97.3% token accuracy (56% sentence accuracy) to close to 100% accuracy. I suggest that it must still be possible to greatly increase tagging performance and examine some useful improvements that have recently been made to the Stanford Part-of-Speech Tagger. However, an error analysis of some of the remaining errors suggests that there is limited further mileage to be had either from better machine learning or better features in a discriminative sequence classifier. The prospects for further gains from semisupervised learning also seem quite limited. Rather, I suggest and begin to demonstrate that the largest opportunity for further progress comes from improving the taxonomic basis of the linguistic resources from which taggers are trained. That is, from improved descriptive linguistics. However, I conclude by suggesting that there are also limits to this process. The status of some words may not be able to be adequately captured by assigning them to one of a small number of categories. While conventions can be used in such cases to improve tagging consistency, they lack a strong linguistic basis.
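The gap between the two figures quoted in the abstract (about 97.3% per token, 56% per sentence) can be illustrated with a short calculation. This is only a sketch: it assumes per-token errors are independent, which real taggers do not satisfy, and the function names are hypothetical helpers, not anything from the paper or the Stanford tagger. Under that assumption a sentence of n tokens is fully correct with probability p^n, so the quoted figures correspond to a sentence of roughly 21 tokens.

```python
import math


def sentence_accuracy(token_accuracy: float, sentence_length: int) -> float:
    """Chance that every token in a sentence is tagged correctly,
    ASSUMING independent per-token errors (a simplification)."""
    return token_accuracy ** sentence_length


def implied_sentence_length(token_accuracy: float, sent_accuracy: float) -> float:
    """Sentence length n at which token_accuracy ** n equals sent_accuracy."""
    return math.log(sent_accuracy) / math.log(token_accuracy)


if __name__ == "__main__":
    # Figures taken from the abstract: 97.3% token, 56% sentence accuracy.
    n = implied_sentence_length(0.973, 0.56)
    print(f"implied sentence length: {n:.1f} tokens")
    for length in (10, 20, 30):
        print(length, f"{sentence_accuracy(0.973, length):.3f}")
```

The same arithmetic shows why small token-level gains matter: under the independence assumption, raising token accuracy from 97.3% to 99% on a 20-token sentence raises the chance of a fully correct sentence from about 58% to about 82%.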
Content
Cf.: http://nlp.stanford.edu/~manning/papers/CICLing2011-manning-tagging.pdf.
Theme
Computational linguistics

Similar documents (author)

  1. Manning, R.W.: ¬The Anglo-American Cataloguing Rules and their future (1999) 5.74
  2. Manning, R.W.: ¬The Anglo American Cataloguing Rules and their future (2000) 5.74
  3. Mallett, J.; Manning, C.: Multimedia and database design : a discussion of database technology and its use in multimedia (1993) 4.59
  4. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 4.59
  5. Manning, C.D.; Schütze, H.: Foundations of statistical natural language processing (2000) 4.59

Similar documents (content)

  1. L'Homme, D.; L'Homme, M.-C.; Lemay, C.: Benchmarking the performance of two Part-of-Speech (POS) taggers for terminological purposes (2002) 0.50
  2. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.30
  3. Xu, C.; Ma, B.; Chen, X.; Ma, F.: Social tagging in the scholarly world (2013) 0.28
  4. Heckner, M.; Mühlbacher, S.; Wolff, C.: Tagging tagging : a classification model for user keywords in scientific bibliography management systems (2007) 0.26
  5. Losee, R.M.: Learning syntactic rules and tags with genetic algorithms for information retrieval and filtering : an empirical basis for grammatical rules (1996) 0.25