Document (#35303)

Author
Dolamic, L.
Savoy, J.
Title
Indexing and searching strategies for the Russian language
Source
Journal of the American Society for Information Science and Technology. 60(2009) no.12, pp. 2540-2547
Year
2009
Abstract
This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance is usually lower than with an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
Theme
Automatic indexing

Similar documents (author)

  1. Savoy, J.: Stemming of French words based on grammatical categories (1993) 5.21
    5.2141504 = sum of:
      5.2141504 = weight(author_txt:savoy in 4650) [ClassicSimilarity], result of:
        5.2141504 = fieldWeight in 4650, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.342641 = idf(docFreq=27, maxDocs=43254)
          0.625 = fieldNorm(doc=4650)
    
  2. Savoy, J.: Effectiveness of information retrieval systems used in a hypertext environment (1993) 5.21
    5.2141504 = sum of:
      5.2141504 = weight(author_txt:savoy in 6511) [ClassicSimilarity], result of:
        5.2141504 = fieldWeight in 6511, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.342641 = idf(docFreq=27, maxDocs=43254)
          0.625 = fieldNorm(doc=6511)
    
  3. Savoy, J.: A learning scheme for information retrieval in hypertext (1994) 5.21
    5.2141504 = sum of:
      5.2141504 = weight(author_txt:savoy in 292) [ClassicSimilarity], result of:
        5.2141504 = fieldWeight in 292, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.342641 = idf(docFreq=27, maxDocs=43254)
          0.625 = fieldNorm(doc=292)
    
  4. Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 5.21
    5.2141504 = sum of:
      5.2141504 = weight(author_txt:savoy in 1261) [ClassicSimilarity], result of:
        5.2141504 = fieldWeight in 1261, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.342641 = idf(docFreq=27, maxDocs=43254)
          0.625 = fieldNorm(doc=1261)
    
  5. Savoy, J.: Searching information in legal hypertext systems (1993/94) 5.21
    5.2141504 = sum of:
      5.2141504 = weight(author_txt:savoy in 1826) [ClassicSimilarity], result of:
        5.2141504 = fieldWeight in 1826, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.342641 = idf(docFreq=27, maxDocs=43254)
          0.625 = fieldNorm(doc=1826)
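
The author scores above all reduce to the same product, tf × idf × fieldNorm. The following is a minimal sketch of that arithmetic, not the search engine's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the form that reproduces the values printed in this record; other Lucene versions may define idf slightly differently. The function names are hypothetical.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it matches idf(docFreq=27, maxDocs=43254) = 8.342641 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the author_txt:savoy entries listed above.
idf_savoy = classic_idf(doc_freq=27, max_docs=43254)
score = field_weight(freq=1.0, doc_freq=27, max_docs=43254, field_norm=0.625)
print(round(idf_savoy, 4), round(score, 2))
```

Because every listed document matches the single author term once and has the same fieldNorm, all five entries receive the identical score of about 5.21.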
    

Similar documents (content)

  1. Savoy, J.: Searching strategies for the Hungarian language (2008) 1.19
    1.1873072 = sum of:
      1.1873072 = product of:
        1.8551676 = sum of:
          0.042208973 = weight(abstract_txt:space in 4038) [ClassicSimilarity], result of:
            0.042208973 = score(doc=4038,freq=1.0), product of:
              0.10015385 = queryWeight, product of:
                1.2230052 = boost
                5.394449 = idf(docFreq=533, maxDocs=43254)
                0.015180715 = queryNorm
              0.42144135 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.394449 = idf(docFreq=533, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.037011858 = weight(abstract_txt:approach in 4038) [ClassicSimilarity], result of:
            0.037011858 = score(doc=4038,freq=3.0), product of:
              0.07282521 = queryWeight, product of:
                1.2772648 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015180715 = queryNorm
              0.50822866 = fieldWeight in 4038, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.024017762 = weight(abstract_txt:than in 4038) [ClassicSimilarity], result of:
            0.024017762 = score(doc=4038,freq=1.0), product of:
              0.0787257 = queryWeight, product of:
                1.3280009 = boost
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.015180715 = queryNorm
              0.30508158 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.07438144 = weight(abstract_txt:vector in 4038) [ClassicSimilarity], result of:
            0.07438144 = score(doc=4038,freq=1.0), product of:
              0.14611927 = queryWeight, product of:
                1.4772303 = boost
                6.5157895 = idf(docFreq=173, maxDocs=43254)
                0.015180715 = queryNorm
              0.5090461 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5157895 = idf(docFreq=173, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.08510738 = weight(abstract_txt:statistically in 4038) [ClassicSimilarity], result of:
            0.08510738 = score(doc=4038,freq=1.0), product of:
              0.15984876 = queryWeight, product of:
                1.5450734 = boost
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.015180715 = queryNorm
              0.5324244 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.040234227 = weight(abstract_txt:performance in 4038) [ClassicSimilarity], result of:
            0.040234227 = score(doc=4038,freq=1.0), product of:
              0.1110432 = queryWeight, product of:
                1.5771976 = boost
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.015180715 = queryNorm
              0.36232948 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.041341946 = weight(abstract_txt:models in 4038) [ClassicSimilarity], result of:
            0.041341946 = score(doc=4038,freq=1.0), product of:
              0.113072105 = queryWeight, product of:
                1.591541 = boost
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.015180715 = queryNorm
              0.3656246 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.044251155 = weight(abstract_txt:significant in 4038) [ClassicSimilarity], result of:
            0.044251155 = score(doc=4038,freq=1.0), product of:
              0.11831631 = queryWeight, product of:
                1.62803 = boost
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.015180715 = queryNorm
              0.37400723 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.07750446 = weight(abstract_txt:strategies in 4038) [ClassicSimilarity], result of:
            0.07750446 = score(doc=4038,freq=2.0), product of:
              0.13644868 = queryWeight, product of:
                1.7483355 = boost
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.015180715 = queryNorm
              0.5680118 = fieldWeight in 4038, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.13694488 = weight(abstract_txt:okapi in 4038) [ClassicSimilarity], result of:
            0.13694488 = score(doc=4038,freq=1.0), product of:
              0.21949686 = queryWeight, product of:
                1.8105421 = boost
                7.9859657 = idf(docFreq=39, maxDocs=43254)
                0.015180715 = queryNorm
              0.6239036 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9859657 = idf(docFreq=39, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.06862281 = weight(abstract_txt:language in 4038) [ClassicSimilarity], result of:
            0.06862281 = score(doc=4038,freq=3.0), product of:
              0.12097057 = queryWeight, product of:
                1.9008565 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.015180715 = queryNorm
              0.56726867 = fieldWeight in 4038, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.12082354 = weight(abstract_txt:light in 4038) [ClassicSimilarity], result of:
            0.12082354 = score(doc=4038,freq=2.0), product of:
              0.18345064 = queryWeight, product of:
                2.0272145 = boost
                5.961112 = idf(docFreq=302, maxDocs=43254)
                0.015180715 = queryNorm
              0.65861607 = fieldWeight in 4038, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.961112 = idf(docFreq=302, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.20868981 = weight(abstract_txt:aggressive in 4038) [ClassicSimilarity], result of:
            0.20868981 = score(doc=4038,freq=1.0), product of:
              0.29066893 = queryWeight, product of:
                2.0835013 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.015180715 = queryNorm
              0.71796393 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.20868981 = weight(abstract_txt:stemmer in 4038) [ClassicSimilarity], result of:
            0.20868981 = score(doc=4038,freq=1.0), product of:
              0.29066893 = queryWeight, product of:
                2.0835013 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.015180715 = queryNorm
              0.71796393 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.06637917 = weight(abstract_txt:differences in 4038) [ClassicSimilarity], result of:
            0.06637917 = score(doc=4038,freq=1.0), product of:
              0.17064582 = queryWeight, product of:
                2.2576535 = boost
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.015180715 = queryNorm
              0.38898796 = fieldWeight in 4038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
          0.5789582 = weight(abstract_txt:stemming in 4038) [ClassicSimilarity], result of:
            0.5789582 = score(doc=4038,freq=3.0), product of:
              0.5738908 = queryWeight, product of:
                5.070721 = boost
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.015180715 = queryNorm
              1.00883 = fieldWeight in 4038, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.078125 = fieldNorm(doc=4038)
        0.64 = coord(16/25)
    
  2. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.52
    0.52284235 = sum of:
      0.52284235 = product of:
        1.3071058 = sum of:
          0.039965063 = weight(abstract_txt:various in 4951) [ClassicSimilarity], result of:
            0.039965063 = score(doc=4951,freq=3.0), product of:
              0.06695932 = queryWeight, product of:
                4.410815 = idf(docFreq=1427, maxDocs=43254)
                0.015180715 = queryNorm
              0.5968559 = fieldWeight in 4951, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.410815 = idf(docFreq=1427, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.056899782 = weight(abstract_txt:performance in 4951) [ClassicSimilarity], result of:
            0.056899782 = score(doc=4951,freq=2.0), product of:
              0.1110432 = queryWeight, product of:
                1.5771976 = boost
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.015180715 = queryNorm
              0.51241124 = fieldWeight in 4951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.040302962 = weight(abstract_txt:approaches in 4951) [ClassicSimilarity], result of:
            0.040302962 = score(doc=4951,freq=1.0), product of:
              0.111169636 = queryWeight, product of:
                1.5780952 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.015180715 = queryNorm
              0.36253572 = fieldWeight in 4951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.041341946 = weight(abstract_txt:models in 4951) [ClassicSimilarity], result of:
            0.041341946 = score(doc=4951,freq=1.0), product of:
              0.113072105 = queryWeight, product of:
                1.591541 = boost
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.015180715 = queryNorm
              0.3656246 = fieldWeight in 4951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.044251155 = weight(abstract_txt:significant in 4951) [ClassicSimilarity], result of:
            0.044251155 = score(doc=4951,freq=1.0), product of:
              0.11831631 = queryWeight, product of:
                1.62803 = boost
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.015180715 = queryNorm
              0.37400723 = fieldWeight in 4951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.056030296 = weight(abstract_txt:language in 4951) [ClassicSimilarity], result of:
            0.056030296 = score(doc=4951,freq=2.0), product of:
              0.12097057 = queryWeight, product of:
                1.9008565 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.015180715 = queryNorm
              0.46317294 = fieldWeight in 4951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.2805282 = weight(abstract_txt:stemmers in 4951) [ClassicSimilarity], result of:
            0.2805282 = score(doc=4951,freq=2.0), product of:
              0.28099945 = queryWeight, product of:
                2.048553 = boost
                9.035788 = idf(docFreq=13, maxDocs=43254)
                0.015180715 = queryNorm
              0.9983229 = fieldWeight in 4951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.035788 = idf(docFreq=13, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.20868981 = weight(abstract_txt:stemmer in 4951) [ClassicSimilarity], result of:
            0.20868981 = score(doc=4951,freq=1.0), product of:
              0.29066893 = queryWeight, product of:
                2.0835013 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.015180715 = queryNorm
              0.71796393 = fieldWeight in 4951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.06637917 = weight(abstract_txt:differences in 4951) [ClassicSimilarity], result of:
            0.06637917 = score(doc=4951,freq=1.0), product of:
              0.17064582 = queryWeight, product of:
                2.2576535 = boost
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.015180715 = queryNorm
              0.38898796 = fieldWeight in 4951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
          0.47271737 = weight(abstract_txt:stemming in 4951) [ClassicSimilarity], result of:
            0.47271737 = score(doc=4951,freq=2.0), product of:
              0.5738908 = queryWeight, product of:
                5.070721 = boost
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.015180715 = queryNorm
              0.82370615 = fieldWeight in 4951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.078125 = fieldNorm(doc=4951)
        0.4 = coord(10/25)
    
  3. Dolamic, L.; Savoy, J.: When stopword lists make the difference (2009) 0.37
    0.3712111 = sum of:
      0.3712111 = product of:
        0.7733565 = sum of:
          0.1073369 = weight(abstract_txt:randomness in 320) [ClassicSimilarity], result of:
            0.1073369 = score(doc=320,freq=1.0), product of:
              0.14809957 = queryWeight, product of:
                1.051614 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.015180715 = queryNorm
              0.7247617 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.03601802 = weight(abstract_txt:result in 320) [ClassicSimilarity], result of:
            0.03601802 = score(doc=320,freq=1.0), product of:
              0.09010406 = queryWeight, product of:
                1.1600231 = boost
                5.1166472 = idf(docFreq=704, maxDocs=43254)
                0.015180715 = queryNorm
              0.39973807 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1166472 = idf(docFreq=704, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.04106169 = weight(abstract_txt:evaluate in 320) [ClassicSimilarity], result of:
            0.04106169 = score(doc=320,freq=1.0), product of:
              0.09833067 = queryWeight, product of:
                1.2118224 = boost
                5.3451242 = idf(docFreq=560, maxDocs=43254)
                0.015180715 = queryNorm
              0.41758782 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3451242 = idf(docFreq=560, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.021368807 = weight(abstract_txt:approach in 320) [ClassicSimilarity], result of:
            0.021368807 = score(doc=320,freq=1.0), product of:
              0.07282521 = queryWeight, product of:
                1.2772648 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015180715 = queryNorm
              0.29342598 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.08510738 = weight(abstract_txt:statistically in 320) [ClassicSimilarity], result of:
            0.08510738 = score(doc=320,freq=1.0), product of:
              0.15984876 = queryWeight, product of:
                1.5450734 = boost
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.015180715 = queryNorm
              0.5324244 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.056899782 = weight(abstract_txt:performance in 320) [ClassicSimilarity], result of:
            0.056899782 = score(doc=320,freq=2.0), product of:
              0.1110432 = queryWeight, product of:
                1.5771976 = boost
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.015180715 = queryNorm
              0.51241124 = fieldWeight in 320, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.040302962 = weight(abstract_txt:approaches in 320) [ClassicSimilarity], result of:
            0.040302962 = score(doc=320,freq=1.0), product of:
              0.111169636 = queryWeight, product of:
                1.5780952 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.015180715 = queryNorm
              0.36253572 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.041341946 = weight(abstract_txt:models in 320) [ClassicSimilarity], result of:
            0.041341946 = score(doc=320,freq=1.0), product of:
              0.113072105 = queryWeight, product of:
                1.591541 = boost
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.015180715 = queryNorm
              0.3656246 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.679995 = idf(docFreq=1090, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.044251155 = weight(abstract_txt:significant in 320) [ClassicSimilarity], result of:
            0.044251155 = score(doc=320,freq=1.0), product of:
              0.11831631 = queryWeight, product of:
                1.62803 = boost
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.015180715 = queryNorm
              0.37400723 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.19366929 = weight(abstract_txt:okapi in 320) [ClassicSimilarity], result of:
            0.19366929 = score(doc=320,freq=2.0), product of:
              0.21949686 = queryWeight, product of:
                1.8105421 = boost
                7.9859657 = idf(docFreq=39, maxDocs=43254)
                0.015180715 = queryNorm
              0.88233286 = fieldWeight in 320, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.9859657 = idf(docFreq=39, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.0396194 = weight(abstract_txt:language in 320) [ClassicSimilarity], result of:
            0.0396194 = score(doc=320,freq=1.0), product of:
              0.12097057 = queryWeight, product of:
                1.9008565 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.015180715 = queryNorm
              0.32751274 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
          0.06637917 = weight(abstract_txt:differences in 320) [ClassicSimilarity], result of:
            0.06637917 = score(doc=320,freq=1.0), product of:
              0.17064582 = queryWeight, product of:
                2.2576535 = boost
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.015180715 = queryNorm
              0.38898796 = fieldWeight in 320, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.078125 = fieldNorm(doc=320)
        0.48 = coord(12/25)
    
  4. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.34
    0.34123102 = sum of:
      0.34123102 = product of:
        1.7061551 = sum of:
          0.04755274 = weight(abstract_txt:than in 4586) [ClassicSimilarity], result of:
            0.04755274 = score(doc=4586,freq=2.0), product of:
              0.0787257 = queryWeight, product of:
                1.3280009 = boost
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.015180715 = queryNorm
              0.60403067 = fieldWeight in 4586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.109375 = fieldNorm(doc=4586)
          0.056327917 = weight(abstract_txt:performance in 4586) [ClassicSimilarity], result of:
            0.056327917 = score(doc=4586,freq=1.0), product of:
              0.1110432 = queryWeight, product of:
                1.5771976 = boost
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.015180715 = queryNorm
              0.5072613 = fieldWeight in 4586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.109375 = fieldNorm(doc=4586)
          0.48100564 = weight(abstract_txt:stemmers in 4586) [ClassicSimilarity], result of:
            0.48100564 = score(doc=4586,freq=3.0), product of:
              0.28099945 = queryWeight, product of:
                2.048553 = boost
                9.035788 = idf(docFreq=13, maxDocs=43254)
                0.015180715 = queryNorm
              1.7117672 = fieldWeight in 4586, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.035788 = idf(docFreq=13, maxDocs=43254)
                0.109375 = fieldNorm(doc=4586)
          0.6533025 = weight(abstract_txt:stemmer in 4586) [ClassicSimilarity], result of:
            0.6533025 = score(doc=4586,freq=5.0), product of:
              0.29066893 = queryWeight, product of:
                2.0835013 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.015180715 = queryNorm
              2.2475827 = fieldWeight in 4586, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.109375 = fieldNorm(doc=4586)
          0.46796638 = weight(abstract_txt:stemming in 4586) [ClassicSimilarity], result of:
            0.46796638 = score(doc=4586,freq=1.0), product of:
              0.5738908 = queryWeight, product of:
                5.070721 = boost
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.015180715 = queryNorm
              0.81542754 = fieldWeight in 4586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.109375 = fieldNorm(doc=4586)
        0.2 = coord(5/25)
    
  5. Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.28
    0.28436857 = sum of:
      0.28436857 = product of:
        0.78991264 = sum of:
          0.014958166 = weight(abstract_txt:approach in 396) [ClassicSimilarity], result of:
            0.014958166 = score(doc=396,freq=1.0), product of:
              0.07282521 = queryWeight, product of:
                1.2772648 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015180715 = queryNorm
              0.20539819 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.016812433 = weight(abstract_txt:than in 396) [ClassicSimilarity], result of:
            0.016812433 = score(doc=396,freq=1.0), product of:
              0.0787257 = queryWeight, product of:
                1.3280009 = boost
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.015180715 = queryNorm
              0.2135571 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.905044 = idf(docFreq=2367, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.05957517 = weight(abstract_txt:statistically in 396) [ClassicSimilarity], result of:
            0.05957517 = score(doc=396,freq=1.0), product of:
              0.15984876 = queryWeight, product of:
                1.5450734 = boost
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.015180715 = queryNorm
              0.3726971 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8150325 = idf(docFreq=128, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.028163958 = weight(abstract_txt:performance in 396) [ClassicSimilarity], result of:
            0.028163958 = score(doc=396,freq=1.0), product of:
              0.1110432 = queryWeight, product of:
                1.5771976 = boost
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.015180715 = queryNorm
              0.25363064 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6378174 = idf(docFreq=1137, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.030975807 = weight(abstract_txt:significant in 396) [ClassicSimilarity], result of:
            0.030975807 = score(doc=396,freq=1.0), product of:
              0.11831631 = queryWeight, product of:
                1.62803 = boost
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.015180715 = queryNorm
              0.26180506 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7872925 = idf(docFreq=979, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.055467162 = weight(abstract_txt:language in 396) [ClassicSimilarity], result of:
            0.055467162 = score(doc=396,freq=4.0), product of:
              0.12097057 = queryWeight, product of:
                1.9008565 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.015180715 = queryNorm
              0.45851782 = fieldWeight in 396, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.20659237 = weight(abstract_txt:stemmer in 396) [ClassicSimilarity], result of:
            0.20659237 = score(doc=396,freq=2.0), product of:
              0.29066893 = queryWeight, product of:
                2.0835013 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.015180715 = queryNorm
              0.710748 = fieldWeight in 396, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.04646542 = weight(abstract_txt:differences in 396) [ClassicSimilarity], result of:
            0.04646542 = score(doc=396,freq=1.0), product of:
              0.17064582 = queryWeight, product of:
                2.2576535 = boost
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.015180715 = queryNorm
              0.27229157 = fieldWeight in 396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.979046 = idf(docFreq=808, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
          0.33090216 = weight(abstract_txt:stemming in 396) [ClassicSimilarity], result of:
            0.33090216 = score(doc=396,freq=2.0), product of:
              0.5738908 = queryWeight, product of:
                5.070721 = boost
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.015180715 = queryNorm
              0.5765943 = fieldWeight in 396, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4553375 = idf(docFreq=67, maxDocs=43254)
                0.0546875 = fieldNorm(doc=396)
        0.36 = coord(9/25)
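
The content scores above combine per-term scores (queryWeight × fieldWeight), sum them, and scale the sum by the coord factor (matched query terms divided by total query terms). The sketch below retraces that calculation for the first entry (doc 4038), under the same assumed tf and idf forms that reproduce the printed values. The helper names are hypothetical, and the constants are copied from the explain output rather than derived.

```python
import math

QUERY_NORM = 0.015180715  # queryNorm as printed in the explain output
MAX_DOCS = 43254          # maxDocs as printed in the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf form; it matches the idf values printed in this record.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM            # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


# abstract_txt:space in doc 4038 (freq=1.0, docFreq=533, boost=1.2230052)
space = term_score(freq=1.0, doc_freq=533, boost=1.2230052, field_norm=0.078125)

# Final score for doc 4038: sum of the 16 term scores times coord(16/25).
sum_of_terms = 1.8551676  # copied from the explain tree, not recomputed here
total = sum_of_terms * (16 / 25)

print(round(space, 4), round(total, 4))
```

The coord factor explains why a document with a large raw sum can still rank low: entry 4 matches only 5 of 25 query terms, so its sum of 1.7061551 is scaled by 0.2.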