Document (#35302)

Author
Dolamic, L.
Savoy, J.
Title
Indexing and searching strategies for the Russian language
Source
Journal of the American Society for Information Science and Technology. 60(2009) no.12, p.2540-2547
Year
2009
Abstract
This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually lower than those of an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
Theme
Automatic indexing

Similar documents (author)

  1. Savoy, J.: Stemming of French words based on grammatical categories (1993) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:savoy in 4650) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 4650, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=4650)
    
  2. Savoy, J.: Effectiveness of information retrieval systems used in a hypertext environment (1993) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:savoy in 6511) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 6511, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=6511)
    
  3. Savoy, J.: A learning scheme for information retrieval in hypertext (1994) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:savoy in 7292) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 7292, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=7292)
    
  4. Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:savoy in 192) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 192, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=192)
    
  5. Savoy, J.: Searching information in legal hypertext systems (1993/94) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:savoy in 757) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 757, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=757)
    

Similar documents (content)

  1. Savoy, J.: Searching strategies for the Hungarian language (2008) 1.19
    1.1880672 = sum of:
      1.1880672 = product of:
        1.856355 = sum of:
          0.042258043 = weight(abstract_txt:space in 2037) [ClassicSimilarity], result of:
            0.042258043 = score(doc=2037,freq=1.0), product of:
              0.10030767 = queryWeight, product of:
                1.2273066 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.015156394 = queryNorm
              0.42128426 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.036785375 = weight(abstract_txt:approach in 2037) [ClassicSimilarity], result of:
            0.036785375 = score(doc=2037,freq=3.0), product of:
              0.07258297 = queryWeight, product of:
                1.278642 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015156394 = queryNorm
              0.5068045 = fieldWeight in 2037, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.023888916 = weight(abstract_txt:than in 2037) [ClassicSimilarity], result of:
            0.023888916 = score(doc=2037,freq=1.0), product of:
              0.07850355 = queryWeight, product of:
                1.3297693 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.015156394 = queryNorm
              0.30430365 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.07472123 = weight(abstract_txt:vector in 2037) [ClassicSimilarity], result of:
            0.07472123 = score(doc=2037,freq=1.0), product of:
              0.1466754 = queryWeight, product of:
                1.484105 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.015156394 = queryNorm
              0.5094326 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.08555193 = weight(abstract_txt:statistically in 2037) [ClassicSimilarity], result of:
            0.08555193 = score(doc=2037,freq=1.0), product of:
              0.16052689 = queryWeight, product of:
                1.5526011 = boost
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.015156394 = queryNorm
              0.53294456 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.04013343 = weight(abstract_txt:performance in 2037) [ClassicSimilarity], result of:
            0.04013343 = score(doc=2037,freq=1.0), product of:
              0.11094196 = queryWeight, product of:
                1.5808095 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015156394 = queryNorm
              0.3617516 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.04042416 = weight(abstract_txt:models in 2037) [ClassicSimilarity], result of:
            0.04042416 = score(doc=2037,freq=1.0), product of:
              0.1114771 = queryWeight, product of:
                1.5846175 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.015156394 = queryNorm
              0.362623 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.043800764 = weight(abstract_txt:significant in 2037) [ClassicSimilarity], result of:
            0.043800764 = score(doc=2037,freq=1.0), product of:
              0.11760148 = queryWeight, product of:
                1.6275638 = boost
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.015156394 = queryNorm
              0.3724508 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.07687512 = weight(abstract_txt:strategies in 2037) [ClassicSimilarity], result of:
            0.07687512 = score(doc=2037,freq=2.0), product of:
              0.13581224 = queryWeight, product of:
                1.7490454 = boost
                5.123207 = idf(docFreq=715, maxDocs=44218)
                0.015156394 = queryNorm
              0.56603974 = fieldWeight in 2037, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.123207 = idf(docFreq=715, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.13839719 = weight(abstract_txt:okapi in 2037) [ClassicSimilarity], result of:
            0.13839719 = score(doc=2037,freq=1.0), product of:
              0.22121407 = queryWeight, product of:
                1.8226043 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.015156394 = queryNorm
              0.6256256 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.06828476 = weight(abstract_txt:language in 2037) [ClassicSimilarity], result of:
            0.06828476 = score(doc=2037,freq=3.0), product of:
              0.120664634 = queryWeight, product of:
                1.9036671 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.015156394 = queryNorm
              0.56590533 = fieldWeight in 2037, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.119315445 = weight(abstract_txt:light in 2037) [ClassicSimilarity], result of:
            0.119315445 = score(doc=2037,freq=2.0), product of:
              0.18205927 = queryWeight, product of:
                2.0250607 = boost
                5.931696 = idf(docFreq=318, maxDocs=44218)
                0.015156394 = queryNorm
              0.65536594 = fieldWeight in 2037, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.931696 = idf(docFreq=318, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.21067496 = weight(abstract_txt:stemmer in 2037) [ClassicSimilarity], result of:
            0.21067496 = score(doc=2037,freq=1.0), product of:
              0.29273176 = queryWeight, product of:
                2.0966258 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015156394 = queryNorm
              0.71968603 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.21067496 = weight(abstract_txt:aggressive in 2037) [ClassicSimilarity], result of:
            0.21067496 = score(doc=2037,freq=1.0), product of:
              0.29273176 = queryWeight, product of:
                2.0966258 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015156394 = queryNorm
              0.71968603 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.065908894 = weight(abstract_txt:differences in 2037) [ClassicSimilarity], result of:
            0.065908894 = score(doc=2037,freq=1.0), product of:
              0.16996804 = queryWeight, product of:
                2.2593558 = boost
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.015156394 = queryNorm
              0.3877723 = fieldWeight in 2037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
          0.5786597 = weight(abstract_txt:stemming in 2037) [ClassicSimilarity], result of:
            0.5786597 = score(doc=2037,freq=3.0), product of:
              0.57412976 = queryWeight, product of:
                5.085711 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015156394 = queryNorm
              1.0078901 = fieldWeight in 2037, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=2037)
        0.64 = coord(16/25)
    
  2. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.52
    0.5232733 = sum of:
      0.5232733 = product of:
        1.3081832 = sum of:
          0.0395923 = weight(abstract_txt:various in 2950) [ClassicSimilarity], result of:
            0.0395923 = score(doc=2950,freq=3.0), product of:
              0.06659291 = queryWeight, product of:
                4.3937173 = idf(docFreq=1484, maxDocs=44218)
                0.015156394 = queryNorm
              0.59454226 = fieldWeight in 2950, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3937173 = idf(docFreq=1484, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.0395656 = weight(abstract_txt:approaches in 2950) [ClassicSimilarity], result of:
            0.0395656 = score(doc=2950,freq=1.0), product of:
              0.10989303 = queryWeight, product of:
                1.5733187 = boost
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.015156394 = queryNorm
              0.3600374 = fieldWeight in 2950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.056757234 = weight(abstract_txt:performance in 2950) [ClassicSimilarity], result of:
            0.056757234 = score(doc=2950,freq=2.0), product of:
              0.11094196 = queryWeight, product of:
                1.5808095 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015156394 = queryNorm
              0.51159394 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.04042416 = weight(abstract_txt:models in 2950) [ClassicSimilarity], result of:
            0.04042416 = score(doc=2950,freq=1.0), product of:
              0.1114771 = queryWeight, product of:
                1.5846175 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.015156394 = queryNorm
              0.362623 = fieldWeight in 2950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.043800764 = weight(abstract_txt:significant in 2950) [ClassicSimilarity], result of:
            0.043800764 = score(doc=2950,freq=1.0), product of:
              0.11760148 = queryWeight, product of:
                1.6275638 = boost
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.015156394 = queryNorm
              0.3724508 = fieldWeight in 2950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.05575427 = weight(abstract_txt:language in 2950) [ClassicSimilarity], result of:
            0.05575427 = score(doc=2950,freq=2.0), product of:
              0.120664634 = queryWeight, product of:
                1.9036671 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.015156394 = queryNorm
              0.46205974 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.28323132 = weight(abstract_txt:stemmers in 2950) [ClassicSimilarity], result of:
            0.28323132 = score(doc=2950,freq=2.0), product of:
              0.2830167 = queryWeight, product of:
                2.0615413 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.015156394 = queryNorm
              1.0007583 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.21067496 = weight(abstract_txt:stemmer in 2950) [ClassicSimilarity], result of:
            0.21067496 = score(doc=2950,freq=1.0), product of:
              0.29273176 = queryWeight, product of:
                2.0966258 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015156394 = queryNorm
              0.71968603 = fieldWeight in 2950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.065908894 = weight(abstract_txt:differences in 2950) [ClassicSimilarity], result of:
            0.065908894 = score(doc=2950,freq=1.0), product of:
              0.16996804 = queryWeight, product of:
                2.2593558 = boost
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.015156394 = queryNorm
              0.3877723 = fieldWeight in 2950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.47247365 = weight(abstract_txt:stemming in 2950) [ClassicSimilarity], result of:
            0.47247365 = score(doc=2950,freq=2.0), product of:
              0.57412976 = queryWeight, product of:
                5.085711 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015156394 = queryNorm
              0.8229388 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
        0.4 = coord(10/25)
    
  3. Dolamic, L.; Savoy, J.: When stopword lists make the difference (2009) 0.37
    0.37119502 = sum of:
      0.37119502 = product of:
        0.773323 = sum of:
          0.108350635 = weight(abstract_txt:randomness in 3319) [ClassicSimilarity], result of:
            0.108350635 = score(doc=3319,freq=1.0), product of:
              0.14914392 = queryWeight, product of:
                1.0582147 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.015156394 = queryNorm
              0.72648376 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.035801083 = weight(abstract_txt:result in 3319) [ClassicSimilarity], result of:
            0.035801083 = score(doc=3319,freq=1.0), product of:
              0.08981013 = queryWeight, product of:
                1.1613114 = boost
                5.1024737 = idf(docFreq=730, maxDocs=44218)
                0.015156394 = queryNorm
              0.39863077 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1024737 = idf(docFreq=730, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.04077722 = weight(abstract_txt:evaluate in 3319) [ClassicSimilarity], result of:
            0.04077722 = score(doc=3319,freq=1.0), product of:
              0.09795042 = queryWeight, product of:
                1.2127999 = boost
                5.3287 = idf(docFreq=582, maxDocs=44218)
                0.015156394 = queryNorm
              0.4163047 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3287 = idf(docFreq=582, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.021238048 = weight(abstract_txt:approach in 3319) [ClassicSimilarity], result of:
            0.021238048 = score(doc=3319,freq=1.0), product of:
              0.07258297 = queryWeight, product of:
                1.278642 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015156394 = queryNorm
              0.29260373 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.08555193 = weight(abstract_txt:statistically in 3319) [ClassicSimilarity], result of:
            0.08555193 = score(doc=3319,freq=1.0), product of:
              0.16052689 = queryWeight, product of:
                1.5526011 = boost
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.015156394 = queryNorm
              0.53294456 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.0395656 = weight(abstract_txt:approaches in 3319) [ClassicSimilarity], result of:
            0.0395656 = score(doc=3319,freq=1.0), product of:
              0.10989303 = queryWeight, product of:
                1.5733187 = boost
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.015156394 = queryNorm
              0.3600374 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.056757234 = weight(abstract_txt:performance in 3319) [ClassicSimilarity], result of:
            0.056757234 = score(doc=3319,freq=2.0), product of:
              0.11094196 = queryWeight, product of:
                1.5808095 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015156394 = queryNorm
              0.51159394 = fieldWeight in 3319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.04042416 = weight(abstract_txt:models in 3319) [ClassicSimilarity], result of:
            0.04042416 = score(doc=3319,freq=1.0), product of:
              0.1114771 = queryWeight, product of:
                1.5846175 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.015156394 = queryNorm
              0.362623 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.043800764 = weight(abstract_txt:significant in 3319) [ClassicSimilarity], result of:
            0.043800764 = score(doc=3319,freq=1.0), product of:
              0.11760148 = queryWeight, product of:
                1.6275638 = boost
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.015156394 = queryNorm
              0.3724508 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.19572319 = weight(abstract_txt:okapi in 3319) [ClassicSimilarity], result of:
            0.19572319 = score(doc=3319,freq=2.0), product of:
              0.22121407 = queryWeight, product of:
                1.8226043 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.015156394 = queryNorm
              0.88476825 = fieldWeight in 3319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.039424222 = weight(abstract_txt:language in 3319) [ClassicSimilarity], result of:
            0.039424222 = score(doc=3319,freq=1.0), product of:
              0.120664634 = queryWeight, product of:
                1.9036671 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.015156394 = queryNorm
              0.32672557 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
          0.065908894 = weight(abstract_txt:differences in 3319) [ClassicSimilarity], result of:
            0.065908894 = score(doc=3319,freq=1.0), product of:
              0.16996804 = queryWeight, product of:
                2.2593558 = boost
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.015156394 = queryNorm
              0.3877723 = fieldWeight in 3319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.078125 = fieldNorm(doc=3319)
        0.48 = coord(12/25)
    
  4. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.34
    0.3432734 = sum of:
      0.3432734 = product of:
        1.716367 = sum of:
          0.04729764 = weight(abstract_txt:than in 2585) [ClassicSimilarity], result of:
            0.04729764 = score(doc=2585,freq=2.0), product of:
              0.07850355 = queryWeight, product of:
                1.3297693 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.015156394 = queryNorm
              0.6024905 = fieldWeight in 2585, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.0561868 = weight(abstract_txt:performance in 2585) [ClassicSimilarity], result of:
            0.0561868 = score(doc=2585,freq=1.0), product of:
              0.11094196 = queryWeight, product of:
                1.5808095 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015156394 = queryNorm
              0.5064522 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.48564053 = weight(abstract_txt:stemmers in 2585) [ClassicSimilarity], result of:
            0.48564053 = score(doc=2585,freq=3.0), product of:
              0.2830167 = queryWeight, product of:
                2.0615413 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.015156394 = queryNorm
              1.715943 = fieldWeight in 2585, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.65951693 = weight(abstract_txt:stemmer in 2585) [ClassicSimilarity], result of:
            0.65951693 = score(doc=2585,freq=5.0), product of:
              0.29273176 = queryWeight, product of:
                2.0966258 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015156394 = queryNorm
              2.2529736 = fieldWeight in 2585, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.46772507 = weight(abstract_txt:stemming in 2585) [ClassicSimilarity], result of:
            0.46772507 = score(doc=2585,freq=1.0), product of:
              0.57412976 = queryWeight, product of:
                5.085711 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015156394 = queryNorm
              0.8146679 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
        0.2 = coord(5/25)
    
  5. Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.28
    0.28470546 = sum of:
      0.28470546 = product of:
        0.79084843 = sum of:
          0.014866634 = weight(abstract_txt:approach in 4395) [ClassicSimilarity], result of:
            0.014866634 = score(doc=4395,freq=1.0), product of:
              0.07258297 = queryWeight, product of:
                1.278642 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015156394 = queryNorm
              0.20482263 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.016722241 = weight(abstract_txt:than in 4395) [ClassicSimilarity], result of:
            0.016722241 = score(doc=4395,freq=1.0), product of:
              0.07850355 = queryWeight, product of:
                1.3297693 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.015156394 = queryNorm
              0.21301256 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.05988635 = weight(abstract_txt:statistically in 4395) [ClassicSimilarity], result of:
            0.05988635 = score(doc=4395,freq=1.0), product of:
              0.16052689 = queryWeight, product of:
                1.5526011 = boost
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.015156394 = queryNorm
              0.37306118 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.0280934 = weight(abstract_txt:performance in 4395) [ClassicSimilarity], result of:
            0.0280934 = score(doc=4395,freq=1.0), product of:
              0.11094196 = queryWeight, product of:
                1.5808095 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015156394 = queryNorm
              0.2532261 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.030660532 = weight(abstract_txt:significant in 4395) [ClassicSimilarity], result of:
            0.030660532 = score(doc=4395,freq=1.0), product of:
              0.11760148 = queryWeight, product of:
                1.6275638 = boost
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.015156394 = queryNorm
              0.26071554 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.055193912 = weight(abstract_txt:language in 4395) [ClassicSimilarity], result of:
            0.055193912 = score(doc=4395,freq=4.0), product of:
              0.120664634 = queryWeight, product of:
                1.9036671 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.015156394 = queryNorm
              0.45741582 = fieldWeight in 4395, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.20855756 = weight(abstract_txt:stemmer in 4395) [ClassicSimilarity], result of:
            0.20855756 = score(doc=4395,freq=2.0), product of:
              0.29273176 = queryWeight, product of:
                2.0966258 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015156394 = queryNorm
              0.71245277 = fieldWeight in 4395, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.046136227 = weight(abstract_txt:differences in 4395) [ClassicSimilarity], result of:
            0.046136227 = score(doc=4395,freq=1.0), product of:
              0.16996804 = queryWeight, product of:
                2.2593558 = boost
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.015156394 = queryNorm
              0.2714406 = fieldWeight in 4395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9634852 = idf(docFreq=839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
          0.33073157 = weight(abstract_txt:stemming in 4395) [ClassicSimilarity], result of:
            0.33073157 = score(doc=4395,freq=2.0), product of:
              0.57412976 = queryWeight, product of:
                5.085711 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015156394 = queryNorm
              0.5760572 = fieldWeight in 4395, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4395)
        0.36 = coord(9/25)
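
The content-match explanations above combine a per-term score (queryWeight * fieldWeight) with a coordination factor (matched query terms / total query terms). The sketch below reproduces one term and one total from the listings, under the same assumed ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names are hypothetical and serve only to illustrate the arithmetic.

```python
import math

MAX_DOCS = 44218          # maxDocs as shown in the listings
QUERY_NORM = 0.015156394  # queryNorm as shown in the listings

def idf(doc_freq, max_docs=MAX_DOCS):
    # Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

def coord(matched, total):
    # Fraction of query terms that occur in the document.
    return matched / total

# "stemming" in doc 2037: freq=3.0, docFreq=69, boost=5.085711, fieldNorm=0.078125.
stemming_2037 = term_score(freq=3.0, doc_freq=69, boost=5.085711, field_norm=0.078125)

# Final score of doc 2037: sum of term scores (1.856355) times coord(16/25).
hungarian_total = 1.856355 * coord(16, 25)

print(round(stemming_2037, 4), round(hungarian_total, 4))
```

Terms listed without a boost line (for example "various" in doc 2950) correspond to boost = 1.0 in this sketch.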