Document (#35300)

Author
Dolamic, L.
Savoy, J.
Title
Indexing and searching strategies for the Russian language
Source
Journal of the American Society for Information Science and Technology. 60(2009) no.12, pp.2540-2547
Year
2009
Abstract
This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually lower than an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
Theme
Automatic indexing

Similar documents (author)

  1. Savoy, J.: Stemming of French words based on grammatical categories (1993) 5.20
    5.1965666 = sum of:
      5.1965666 = weight(author_txt:savoy in 4647) [ClassicSimilarity], result of:
        5.1965666 = fieldWeight in 4647, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.314507 = idf(docFreq=28, maxDocs=43556)
          0.625 = fieldNorm(doc=4647)
    
  2. Savoy, J.: Effectiveness of information retrieval systems used in a hypertext environment (1993) 5.20
    5.1965666 = sum of:
      5.1965666 = weight(author_txt:savoy in 6508) [ClassicSimilarity], result of:
        5.1965666 = fieldWeight in 6508, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.314507 = idf(docFreq=28, maxDocs=43556)
          0.625 = fieldNorm(doc=6508)
    
  3. Savoy, J.: A learning scheme for information retrieval in hypertext (1994) 5.20
    5.1965666 = sum of:
      5.1965666 = weight(author_txt:savoy in 7289) [ClassicSimilarity], result of:
        5.1965666 = fieldWeight in 7289, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.314507 = idf(docFreq=28, maxDocs=43556)
          0.625 = fieldNorm(doc=7289)
    
  4. Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 5.20
    5.1965666 = sum of:
      5.1965666 = weight(author_txt:savoy in 258) [ClassicSimilarity], result of:
        5.1965666 = fieldWeight in 258, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.314507 = idf(docFreq=28, maxDocs=43556)
          0.625 = fieldNorm(doc=258)
    
  5. Savoy, J.: Searching information in legal hypertext systems (1993/94) 5.20
    5.1965666 = sum of:
      5.1965666 = weight(author_txt:savoy in 823) [ClassicSimilarity], result of:
        5.1965666 = fieldWeight in 823, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.314507 = idf(docFreq=28, maxDocs=43556)
          0.625 = fieldNorm(doc=823)
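
The author-match scores above can be reproduced by hand. The following is a minimal Python sketch, not the catalog's actual code: it assumes tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)), which is the form that matches the numbers printed in these explanations. The function names are illustrative only.

```python
import math

def classic_tf(freq: float) -> float:
    # tf as shown in the explanations: tf(freq=1.0) = 1.0, tf(freq=2.0) = 1.4142135
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces idf(docFreq=28, maxDocs=43556) = 8.314507
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as listed under "product of:"
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the author_txt:savoy explanations above.
author_score = field_weight(freq=1.0, doc_freq=28, max_docs=43556, field_norm=0.625)
print(round(author_score, 4))
```

All five author matches share the same termFreq, docFreq, and fieldNorm, which is why they all receive the same score of about 5.20.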
    

Similar documents (content)

  1. Savoy, J.: Searching strategies for the Hungarian language (2008) 1.19
    1.1889482 = sum of:
      1.1889482 = product of:
        1.8577316 = sum of:
          0.04230253 = weight(abstract_txt:space in 4035) [ClassicSimilarity], result of:
            0.04230253 = score(doc=4035,freq=1.0), product of:
              0.10031598 = queryWeight, product of:
                1.2250894 = boost
                5.3976684 = idf(docFreq=535, maxDocs=43556)
                0.015170368 = queryNorm
              0.42169285 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3976684 = idf(docFreq=535, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.03705102 = weight(abstract_txt:approach in 4035) [ClassicSimilarity], result of:
            0.03705102 = score(doc=4035,freq=3.0), product of:
              0.07288688 = queryWeight, product of:
                1.2789484 = boost
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015170368 = queryNorm
              0.50833595 = fieldWeight in 4035, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.023916475 = weight(abstract_txt:than in 4035) [ClassicSimilarity], result of:
            0.023916475 = score(doc=4035,freq=1.0), product of:
              0.07851533 = queryWeight, product of:
                1.3274115 = boost
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.015170368 = queryNorm
              0.304609 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.07445507 = weight(abstract_txt:vector in 4035) [ClassicSimilarity], result of:
            0.07445507 = score(doc=4035,freq=1.0), product of:
              0.14623637 = queryWeight, product of:
                1.479144 = boost
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.015170368 = queryNorm
              0.5091419 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.08540456 = weight(abstract_txt:statistically in 4035) [ClassicSimilarity], result of:
            0.08540456 = score(doc=4035,freq=1.0), product of:
              0.16024332 = queryWeight, product of:
                1.5483627 = boost
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.015170368 = queryNorm
              0.532968 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.040227357 = weight(abstract_txt:performance in 4035) [ClassicSimilarity], result of:
            0.040227357 = score(doc=4035,freq=1.0), product of:
              0.11104627 = queryWeight, product of:
                1.57863 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.015170368 = queryNorm
              0.36225763 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.04113443 = weight(abstract_txt:models in 4035) [ClassicSimilarity], result of:
            0.04113443 = score(doc=4035,freq=1.0), product of:
              0.11270935 = queryWeight, product of:
                1.5904073 = boost
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.015170368 = queryNorm
              0.3649602 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.044181366 = weight(abstract_txt:significant in 4035) [ClassicSimilarity], result of:
            0.044181366 = score(doc=4035,freq=1.0), product of:
              0.118208595 = queryWeight, product of:
                1.6287442 = boost
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.015170368 = queryNorm
              0.37375763 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.07726445 = weight(abstract_txt:strategies in 4035) [ClassicSimilarity], result of:
            0.07726445 = score(doc=4035,freq=2.0), product of:
              0.13618611 = queryWeight, product of:
                1.7482147 = boost
                5.1350174 = idf(docFreq=696, maxDocs=43556)
                0.015170368 = queryNorm
              0.5673446 = fieldWeight in 4035, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1350174 = idf(docFreq=696, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.13736139 = weight(abstract_txt:okapi in 4035) [ClassicSimilarity], result of:
            0.13736139 = score(doc=4035,freq=1.0), product of:
              0.21997282 = queryWeight, product of:
                1.8141252 = boost
                7.9929233 = idf(docFreq=39, maxDocs=43556)
                0.015170368 = queryNorm
              0.6244471 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9929233 = idf(docFreq=39, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.06855346 = weight(abstract_txt:language in 4035) [ClassicSimilarity], result of:
            0.06855346 = score(doc=4035,freq=3.0), product of:
              0.12090615 = queryWeight, product of:
                1.90205 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.015170368 = queryNorm
              0.5669973 = fieldWeight in 4035, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.120303355 = weight(abstract_txt:light in 4035) [ClassicSimilarity], result of:
            0.120303355 = score(doc=4035,freq=2.0), product of:
              0.18294962 = queryWeight, product of:
                2.0262551 = boost
                5.951703 = idf(docFreq=307, maxDocs=43556)
                0.015170368 = queryNorm
              0.65757644 = fieldWeight in 4035, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.951703 = idf(docFreq=307, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.20925298 = weight(abstract_txt:stemmer in 4035) [ClassicSimilarity], result of:
            0.20925298 = score(doc=4035,freq=1.0), product of:
              0.29123282 = queryWeight, product of:
                2.0873866 = boost
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.015170368 = queryNorm
              0.7185075 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.20925298 = weight(abstract_txt:aggressive in 4035) [ClassicSimilarity], result of:
            0.20925298 = score(doc=4035,freq=1.0), product of:
              0.29123282 = queryWeight, product of:
                2.0873866 = boost
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.015170368 = queryNorm
              0.7185075 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.06624322 = weight(abstract_txt:differences in 4035) [ClassicSimilarity], result of:
            0.06624322 = score(doc=4035,freq=1.0), product of:
              0.17043684 = queryWeight, product of:
                2.2582889 = boost
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.015170368 = queryNorm
              0.38866723 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
          0.58082706 = weight(abstract_txt:stemming in 4035) [ClassicSimilarity], result of:
            0.58082706 = score(doc=4035,freq=3.0), product of:
              0.5752065 = queryWeight, product of:
                5.081071 = boost
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.015170368 = queryNorm
              1.0097713 = fieldWeight in 4035, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.078125 = fieldNorm(doc=4035)
        0.64 = coord(16/25)
    
  2. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.52
    0.5236177 = sum of:
      0.5236177 = product of:
        1.3090441 = sum of:
          0.039849564 = weight(abstract_txt:various in 12) [ClassicSimilarity], result of:
            0.039849564 = score(doc=12,freq=3.0), product of:
              0.0668397 = queryWeight, product of:
                4.405938 = idf(docFreq=1444, maxDocs=43556)
                0.015170368 = queryNorm
              0.596196 = fieldWeight in 12, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.405938 = idf(docFreq=1444, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.03997946 = weight(abstract_txt:approaches in 12) [ClassicSimilarity], result of:
            0.03997946 = score(doc=12,freq=1.0), product of:
              0.11058959 = queryWeight, product of:
                1.5753806 = boost
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.015170368 = queryNorm
              0.36151198 = fieldWeight in 12, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.05689007 = weight(abstract_txt:performance in 12) [ClassicSimilarity], result of:
            0.05689007 = score(doc=12,freq=2.0), product of:
              0.11104627 = queryWeight, product of:
                1.57863 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.015170368 = queryNorm
              0.5123096 = fieldWeight in 12, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.04113443 = weight(abstract_txt:models in 12) [ClassicSimilarity], result of:
            0.04113443 = score(doc=12,freq=1.0), product of:
              0.11270935 = queryWeight, product of:
                1.5904073 = boost
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.015170368 = queryNorm
              0.3649602 = fieldWeight in 12, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.044181366 = weight(abstract_txt:significant in 12) [ClassicSimilarity], result of:
            0.044181366 = score(doc=12,freq=1.0), product of:
              0.118208595 = queryWeight, product of:
                1.6287442 = boost
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.015170368 = queryNorm
              0.37375763 = fieldWeight in 12, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.055973668 = weight(abstract_txt:language in 12) [ClassicSimilarity], result of:
            0.055973668 = score(doc=12,freq=2.0), product of:
              0.12090615 = queryWeight, product of:
                1.90205 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.015170368 = queryNorm
              0.46295136 = fieldWeight in 12, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.28129607 = weight(abstract_txt:stemmers in 12) [ClassicSimilarity], result of:
            0.28129607 = score(doc=12,freq=2.0), product of:
              0.28155184 = queryWeight, product of:
                2.0523996 = boost
                9.042746 = idf(docFreq=13, maxDocs=43556)
                0.015170368 = queryNorm
              0.9990916 = fieldWeight in 12, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.042746 = idf(docFreq=13, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.20925298 = weight(abstract_txt:stemmer in 12) [ClassicSimilarity], result of:
            0.20925298 = score(doc=12,freq=1.0), product of:
              0.29123282 = queryWeight, product of:
                2.0873866 = boost
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.015170368 = queryNorm
              0.7185075 = fieldWeight in 12, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.06624322 = weight(abstract_txt:differences in 12) [ClassicSimilarity], result of:
            0.06624322 = score(doc=12,freq=1.0), product of:
              0.17043684 = queryWeight, product of:
                2.2582889 = boost
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.015170368 = queryNorm
              0.38866723 = fieldWeight in 12, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
          0.47424334 = weight(abstract_txt:stemming in 12) [ClassicSimilarity], result of:
            0.47424334 = score(doc=12,freq=2.0), product of:
              0.5752065 = queryWeight, product of:
                5.081071 = boost
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.015170368 = queryNorm
              0.82447493 = fieldWeight in 12, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.078125 = fieldNorm(doc=12)
        0.4 = coord(10/25)
    
  3. Dolamic, L.; Savoy, J.: When stopword lists make the difference (2009) 0.37
    0.3713693 = sum of:
      0.3713693 = product of:
        0.77368605 = sum of:
          0.10762428 = weight(abstract_txt:randomness in 317) [ClassicSimilarity], result of:
            0.10762428 = score(doc=317,freq=1.0), product of:
              0.1483848 = queryWeight, product of:
                1.0535676 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.015170368 = queryNorm
              0.7253053 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.035882954 = weight(abstract_txt:result in 317) [ClassicSimilarity], result of:
            0.035882954 = score(doc=317,freq=1.0), product of:
              0.089891374 = queryWeight, product of:
                1.1596895 = boost
                5.1095204 = idf(docFreq=714, maxDocs=43556)
                0.015170368 = queryNorm
              0.39918128 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1095204 = idf(docFreq=714, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.04111658 = weight(abstract_txt:evaluate in 317) [ClassicSimilarity], result of:
            0.04111658 = score(doc=317,freq=1.0), product of:
              0.098432206 = queryWeight, product of:
                1.2135323 = boost
                5.3467484 = idf(docFreq=563, maxDocs=43556)
                0.015170368 = queryNorm
              0.41771472 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3467484 = idf(docFreq=563, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.021391416 = weight(abstract_txt:approach in 317) [ClassicSimilarity], result of:
            0.021391416 = score(doc=317,freq=1.0), product of:
              0.07288688 = queryWeight, product of:
                1.2789484 = boost
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015170368 = queryNorm
              0.2934879 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.08540456 = weight(abstract_txt:statistically in 317) [ClassicSimilarity], result of:
            0.08540456 = score(doc=317,freq=1.0), product of:
              0.16024332 = queryWeight, product of:
                1.5483627 = boost
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.015170368 = queryNorm
              0.532968 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.03997946 = weight(abstract_txt:approaches in 317) [ClassicSimilarity], result of:
            0.03997946 = score(doc=317,freq=1.0), product of:
              0.11058959 = queryWeight, product of:
                1.5753806 = boost
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.015170368 = queryNorm
              0.36151198 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.05689007 = weight(abstract_txt:performance in 317) [ClassicSimilarity], result of:
            0.05689007 = score(doc=317,freq=2.0), product of:
              0.11104627 = queryWeight, product of:
                1.57863 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.015170368 = queryNorm
              0.5123096 = fieldWeight in 317, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.04113443 = weight(abstract_txt:models in 317) [ClassicSimilarity], result of:
            0.04113443 = score(doc=317,freq=1.0), product of:
              0.11270935 = queryWeight, product of:
                1.5904073 = boost
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.015170368 = queryNorm
              0.3649602 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6714907 = idf(docFreq=1107, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.044181366 = weight(abstract_txt:significant in 317) [ClassicSimilarity], result of:
            0.044181366 = score(doc=317,freq=1.0), product of:
              0.118208595 = queryWeight, product of:
                1.6287442 = boost
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.015170368 = queryNorm
              0.37375763 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.19425835 = weight(abstract_txt:okapi in 317) [ClassicSimilarity], result of:
            0.19425835 = score(doc=317,freq=2.0), product of:
              0.21997282 = queryWeight, product of:
                1.8141252 = boost
                7.9929233 = idf(docFreq=39, maxDocs=43556)
                0.015170368 = queryNorm
              0.8831016 = fieldWeight in 317, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.9929233 = idf(docFreq=39, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.039579358 = weight(abstract_txt:language in 317) [ClassicSimilarity], result of:
            0.039579358 = score(doc=317,freq=1.0), product of:
              0.12090615 = queryWeight, product of:
                1.90205 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.015170368 = queryNorm
              0.32735604 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
          0.06624322 = weight(abstract_txt:differences in 317) [ClassicSimilarity], result of:
            0.06624322 = score(doc=317,freq=1.0), product of:
              0.17043684 = queryWeight, product of:
                2.2582889 = boost
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.015170368 = queryNorm
              0.38866723 = fieldWeight in 317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.078125 = fieldNorm(doc=317)
        0.48 = coord(12/25)
    
  4. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.34
    0.34210706 = sum of:
      0.34210706 = product of:
        1.7105353 = sum of:
          0.047352206 = weight(abstract_txt:than in 3583) [ClassicSimilarity], result of:
            0.047352206 = score(doc=3583,freq=2.0), product of:
              0.07851533 = queryWeight, product of:
                1.3274115 = boost
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.015170368 = queryNorm
              0.60309505 = fieldWeight in 3583, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.109375 = fieldNorm(doc=3583)
          0.056318298 = weight(abstract_txt:performance in 3583) [ClassicSimilarity], result of:
            0.056318298 = score(doc=3583,freq=1.0), product of:
              0.11104627 = queryWeight, product of:
                1.57863 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.015170368 = queryNorm
              0.50716066 = fieldWeight in 3583, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.109375 = fieldNorm(doc=3583)
          0.48232234 = weight(abstract_txt:stemmers in 3583) [ClassicSimilarity], result of:
            0.48232234 = score(doc=3583,freq=3.0), product of:
              0.28155184 = queryWeight, product of:
                2.0523996 = boost
                9.042746 = idf(docFreq=13, maxDocs=43556)
                0.015170368 = queryNorm
              1.7130854 = fieldWeight in 3583, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.042746 = idf(docFreq=13, maxDocs=43556)
                0.109375 = fieldNorm(doc=3583)
          0.6550655 = weight(abstract_txt:stemmer in 3583) [ClassicSimilarity], result of:
            0.6550655 = score(doc=3583,freq=5.0), product of:
              0.29123282 = queryWeight, product of:
                2.0873866 = boost
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.015170368 = queryNorm
              2.2492845 = fieldWeight in 3583, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.109375 = fieldNorm(doc=3583)
          0.46947697 = weight(abstract_txt:stemming in 3583) [ClassicSimilarity], result of:
            0.46947697 = score(doc=3583,freq=1.0), product of:
              0.5752065 = queryWeight, product of:
                5.081071 = boost
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.015170368 = queryNorm
              0.8161885 = fieldWeight in 3583, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.109375 = fieldNorm(doc=3583)
        0.2 = coord(5/25)
    
  5. Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.28
    0.28493512 = sum of:
      0.28493512 = product of:
        0.7914864 = sum of:
          0.014973992 = weight(abstract_txt:approach in 393) [ClassicSimilarity], result of:
            0.014973992 = score(doc=393,freq=1.0), product of:
              0.07288688 = queryWeight, product of:
                1.2789484 = boost
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015170368 = queryNorm
              0.20544153 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.016741535 = weight(abstract_txt:than in 393) [ClassicSimilarity], result of:
            0.016741535 = score(doc=393,freq=1.0), product of:
              0.07851533 = queryWeight, product of:
                1.3274115 = boost
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.015170368 = queryNorm
              0.21322632 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8989954 = idf(docFreq=2398, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.059783194 = weight(abstract_txt:statistically in 393) [ClassicSimilarity], result of:
            0.059783194 = score(doc=393,freq=1.0), product of:
              0.16024332 = queryWeight, product of:
                1.5483627 = boost
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.015170368 = queryNorm
              0.3730776 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8219905 = idf(docFreq=128, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.028159149 = weight(abstract_txt:performance in 393) [ClassicSimilarity], result of:
            0.028159149 = score(doc=393,freq=1.0), product of:
              0.11104627 = queryWeight, product of:
                1.57863 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.015170368 = queryNorm
              0.25358033 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.030926956 = weight(abstract_txt:significant in 393) [ClassicSimilarity], result of:
            0.030926956 = score(doc=393,freq=1.0), product of:
              0.118208595 = queryWeight, product of:
                1.6287442 = boost
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.015170368 = queryNorm
              0.26163036 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7840977 = idf(docFreq=989, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.055411104 = weight(abstract_txt:language in 393) [ClassicSimilarity], result of:
            0.055411104 = score(doc=393,freq=4.0), product of:
              0.12090615 = queryWeight, product of:
                1.90205 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.015170368 = queryNorm
              0.45829847 = fieldWeight in 393, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.20714986 = weight(abstract_txt:stemmer in 393) [ClassicSimilarity], result of:
            0.20714986 = score(doc=393,freq=2.0), product of:
              0.29123282 = queryWeight, product of:
                2.0873866 = boost
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.015170368 = queryNorm
              0.7112861 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.196897 = idf(docFreq=11, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.04637025 = weight(abstract_txt:differences in 393) [ClassicSimilarity], result of:
            0.04637025 = score(doc=393,freq=1.0), product of:
              0.17043684 = queryWeight, product of:
                2.2582889 = boost
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.015170368 = queryNorm
              0.27206704 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9749403 = idf(docFreq=817, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
          0.33197036 = weight(abstract_txt:stemming in 393) [ClassicSimilarity], result of:
            0.33197036 = score(doc=393,freq=2.0), product of:
              0.5752065 = queryWeight, product of:
                5.081071 = boost
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.015170368 = queryNorm
              0.57713246 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.462295 = idf(docFreq=67, maxDocs=43556)
                0.0546875 = fieldNorm(doc=393)
        0.36 = coord(9/25)
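
The content-similarity scores combine three steps that appear in each tree: a per-term score (queryWeight times fieldWeight), a sum over the matching terms, and a coord factor for the share of query terms matched. The sketch below recomputes the "stemming" term and the final score for the last entry (doc=393) under the same assumed tf and idf formulas as the explanations suggest. The helper names are hypothetical and are not part of any library.

```python
import math

MAX_DOCS = 43556          # maxDocs from the explanations
QUERY_NORM = 0.015170368  # queryNorm from the explanations

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf form that matches the printed values (e.g. docFreq=67 -> 7.462295)
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    fld_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * fld_weight

def coord(matched: int, total: int) -> float:
    # Fraction of query terms found in the document, e.g. coord(9/25) = 0.36
    return matched / total

# "stemming" in doc 393: boost=5.081071, freq=2.0, docFreq=67, fieldNorm=0.0546875
stemming_score = term_score(boost=5.081071, freq=2.0, doc_freq=67, field_norm=0.0546875)

# Final score for doc 393: the listed sum of term scores times coord(9/25)
final_score = 0.7914864 * coord(9, 25)

print(round(stemming_score, 4), round(final_score, 4))
```

The coord factor explains why a document with a large raw sum can still rank low: entry 4 has a sum of 1.7105353 but matches only 5 of 25 query terms, so its score is scaled by 0.2.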