Document (#14869)

Author
Paice, C.D.
Title
Method for evaluation of stemming algorithms based on error counting
Source
Journal of the American Society for Information Science. 47(1996) no.8, pp. 632-649
Year
1996
Abstract
Assesses the effectiveness of stemming algorithms by counting the identifiable errors made while stemming words from various text samples. This entails manually grouping the words in each sample using software developed for this purpose, stemming the words, and computing indices which represent the rates of understemming and overstemming. Presents the results for 3 stemmers (Lovins, Porter, and Paice/Husk), in each case using 3 text samples.
Theme
Computational linguistics
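
The error-counting idea in the abstract can be illustrated with a small sketch. The abstract itself gives no formulas; the pair-counting definitions below follow the way the method is usually summarized (an understemming index as unachieved merges over desired merges, and an overstemming index as wrong merges over desired non-merges) and should be read as an assumption, not as the paper's exact procedure. The function name `paice_indices` and the toy word groups are hypothetical.

```python
from collections import Counter

def paice_indices(groups, stem):
    """Return (understemming_index, overstemming_index).

    groups: list of manually defined concept groups (lists of words).
    stem:   any callable mapping a word to its stem.
    Assumed definitions (not taken from the abstract):
      UI = unachieved merge total / desired merge total
      OI = wrongly merged total   / desired non-merge total
    """
    total_words = sum(len(g) for g in groups)
    desired_merge = unachieved_merge = desired_nonmerge = 0.0
    stem_groups = {}  # stem -> Counter(concept group index -> word count)

    for gi, group in enumerate(groups):
        n = len(group)
        desired_merge += 0.5 * n * (n - 1)
        desired_nonmerge += 0.5 * n * (total_words - n)
        stems = Counter(stem(w) for w in group)
        # Word pairs in the same concept group that received different stems.
        unachieved_merge += 0.5 * sum(u * (n - u) for u in stems.values())
        for s, u in stems.items():
            stem_groups.setdefault(s, Counter())[gi] += u

    wrongly_merged = 0.0
    for counts in stem_groups.values():
        ns = sum(counts.values())
        # Word pairs sharing a stem but belonging to different concept groups.
        wrongly_merged += 0.5 * sum(v * (ns - v) for v in counts.values())

    ui = unachieved_merge / desired_merge if desired_merge else 0.0
    oi = wrongly_merged / desired_nonmerge if desired_nonmerge else 0.0
    return ui, oi

# Toy data: three hand-made groups and a crude "first five letters" stemmer.
groups = [
    ["connect", "connected", "connection"],
    ["general", "generally"],
    ["generate", "generates"],
]
ui, oi = paice_indices(groups, lambda w: w[:5])
print(ui, oi)
```

With the truncating stemmer every group is fully merged, so no understemming is counted, but "general" and "generate" collapse to the same stem and show up as overstemming. An identity stemmer gives the opposite picture.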

Similar documents (author)

  1. Paice, C.D.: Expert systems for information retrieval? (1986) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:paice in 1101) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 1101, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=1101)
    
  2. Paice, C.D.: A thesaural model of information retrieval (1991) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:paice in 2294) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 2294, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=2294)
    
  3. Paice, C.D.: Automatic abstracting (1994) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:paice in 917) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 917, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=917)
    
  4. Paice, C.D.: Automatic abstracting (1994) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:paice in 1255) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 1255, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=1255)
    
  5. Paice, C.D.: Soft evaluation of Boolean search queries in information retrieval systems (1984) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:paice in 789) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 789, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=789)
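
The author scores above can be reproduced by hand. The following is a minimal sketch of the arithmetic that Lucene's ClassicSimilarity explain output reports for a single-term match, assuming the standard formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical, and fieldNorm is copied from the listing, not recomputed.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, matching the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the author_txt:paice entries above.
print(round(classic_idf(8, 44218), 4))
print(round(field_weight(1.0, 8, 44218, 0.625), 4))
```

All five author matches share the same freq, docFreq and fieldNorm, which is why they all receive the same score of 5.94.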
    

Similar documents (content)

  1. Kraaij, W.; Pohlmann, R.: Evaluation of a Dutch stemming algorithm (1995) 0.33
    0.332727 = sum of:
      0.332727 = product of:
        1.1883106 = sum of:
          0.017951056 = weight(abstract_txt:using in 5798) [ClassicSimilarity], result of:
            0.017951056 = score(doc=5798,freq=1.0), product of:
              0.066348724 = queryWeight, product of:
                1.4407761 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.013297461 = queryNorm
              0.27055615 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.028581062 = weight(abstract_txt:text in 5798) [ClassicSimilarity], result of:
            0.028581062 = score(doc=5798,freq=1.0), product of:
              0.090467274 = queryWeight, product of:
                1.6823872 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013297461 = queryNorm
              0.3159271 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.04270676 = weight(abstract_txt:each in 5798) [ClassicSimilarity], result of:
            0.04270676 = score(doc=5798,freq=2.0), product of:
              0.09384844 = queryWeight, product of:
                1.7135379 = boost
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.013297461 = queryNorm
              0.45506096 = fieldWeight in 5798, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.1605944 = weight(abstract_txt:stemmers in 5798) [ClassicSimilarity], result of:
            0.1605944 = score(doc=5798,freq=1.0), product of:
              0.22694269 = queryWeight, product of:
                1.884184 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.013297461 = queryNorm
              0.707643 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.1791643 = weight(abstract_txt:porter in 5798) [ClassicSimilarity], result of:
            0.1791643 = score(doc=5798,freq=1.0), product of:
              0.24411638 = queryWeight, product of:
                1.9541761 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.013297461 = queryNorm
              0.7339299 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.1406331 = weight(abstract_txt:words in 5798) [ClassicSimilarity], result of:
            0.1406331 = score(doc=5798,freq=2.0), product of:
              0.23778515 = queryWeight, product of:
                3.3405519 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.013297461 = queryNorm
              0.5914293 = fieldWeight in 5798, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.61868 = weight(abstract_txt:stemming in 5798) [ClassicSimilarity], result of:
            0.61868 = score(doc=5798,freq=3.0), product of:
              0.61383677 = queryWeight, product of:
                6.197573 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.013297461 = queryNorm
              1.0078901 = fieldWeight in 5798, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
        0.28 = coord(7/25)
    
  2. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.26
    0.260195 = sum of:
      0.260195 = product of:
        1.0841458 = sum of:
          0.033611543 = weight(abstract_txt:case in 2585) [ClassicSimilarity], result of:
            0.033611543 = score(doc=2585,freq=1.0), product of:
              0.06392483 = queryWeight, product of:
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.013297461 = queryNorm
              0.52579796 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.008501535 = weight(abstract_txt:this in 2585) [ClassicSimilarity], result of:
            0.008501535 = score(doc=2585,freq=1.0), product of:
              0.032212082 = queryWeight, product of:
                1.003898 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.013297461 = queryNorm
              0.2639238 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.040013485 = weight(abstract_txt:text in 2585) [ClassicSimilarity], result of:
            0.040013485 = score(doc=2585,freq=1.0), product of:
              0.090467274 = queryWeight, product of:
                1.6823872 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013297461 = queryNorm
              0.4422979 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.38942072 = weight(abstract_txt:stemmers in 2585) [ClassicSimilarity], result of:
            0.38942072 = score(doc=2585,freq=3.0), product of:
              0.22694269 = queryWeight, product of:
                1.884184 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.013297461 = queryNorm
              1.715943 = fieldWeight in 2585, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.112525485 = weight(abstract_txt:algorithms in 2585) [ClassicSimilarity], result of:
            0.112525485 = score(doc=2585,freq=1.0), product of:
              0.18024138 = queryWeight, product of:
                2.374693 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013297461 = queryNorm
              0.6243044 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
          0.5000731 = weight(abstract_txt:stemming in 2585) [ClassicSimilarity], result of:
            0.5000731 = score(doc=2585,freq=1.0), product of:
              0.61383677 = queryWeight, product of:
                6.197573 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.013297461 = queryNorm
              0.8146679 = fieldWeight in 2585, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.109375 = fieldNorm(doc=2585)
        0.24 = coord(6/25)
    
  3. Frakes, W.B.: Stemming algorithms (1992) 0.26
    0.25665754 = sum of:
      0.25665754 = product of:
        1.6041096 = sum of:
          0.045822 = weight(abstract_txt:effectiveness in 3503) [ClassicSimilarity], result of:
            0.045822 = score(doc=3503,freq=1.0), product of:
              0.07190051 = queryWeight, product of:
                1.0605501 = boost
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.013297461 = queryNorm
              0.6372973 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.28666288 = weight(abstract_txt:porter in 3503) [ClassicSimilarity], result of:
            0.28666288 = score(doc=3503,freq=1.0), product of:
              0.24411638 = queryWeight, product of:
                1.9541761 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.013297461 = queryNorm
              1.1742878 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.12860055 = weight(abstract_txt:algorithms in 3503) [ClassicSimilarity], result of:
            0.12860055 = score(doc=3503,freq=1.0), product of:
              0.18024138 = queryWeight, product of:
                2.374693 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013297461 = queryNorm
              0.7134907 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          1.1430242 = weight(abstract_txt:stemming in 3503) [ClassicSimilarity], result of:
            1.1430242 = score(doc=3503,freq=4.0), product of:
              0.61383677 = queryWeight, product of:
                6.197573 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.013297461 = queryNorm
              1.862098 = fieldWeight in 3503, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
        0.16 = coord(4/25)
    
  4. Flores, F.N.; Moreira, V.P.: Assessing the impact of stemming accuracy on information retrieval : a multilingual perspective (2016) 0.24
    0.23978713 = sum of:
      0.23978713 = product of:
        0.9991131 = sum of:
          0.024008248 = weight(abstract_txt:case in 3187) [ClassicSimilarity], result of:
            0.024008248 = score(doc=3187,freq=1.0), product of:
              0.06392483 = queryWeight, product of:
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.013297461 = queryNorm
              0.37557 = fieldWeight in 3187, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
          0.008587847 = weight(abstract_txt:this in 3187) [ClassicSimilarity], result of:
            0.008587847 = score(doc=3187,freq=2.0), product of:
              0.032212082 = queryWeight, product of:
                1.003898 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.013297461 = queryNorm
              0.2666033 = fieldWeight in 3187, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
          0.06954135 = weight(abstract_txt:error in 3187) [ClassicSimilarity], result of:
            0.06954135 = score(doc=3187,freq=1.0), product of:
              0.12989467 = queryWeight, product of:
                1.4254792 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.013297461 = queryNorm
              0.5353672 = fieldWeight in 3187, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
          0.27815765 = weight(abstract_txt:stemmers in 3187) [ClassicSimilarity], result of:
            0.27815765 = score(doc=3187,freq=3.0), product of:
              0.22694269 = queryWeight, product of:
                1.884184 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.013297461 = queryNorm
              1.2256736 = fieldWeight in 3187, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
          0.1136679 = weight(abstract_txt:algorithms in 3187) [ClassicSimilarity], result of:
            0.1136679 = score(doc=3187,freq=2.0), product of:
              0.18024138 = queryWeight, product of:
                2.374693 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013297461 = queryNorm
              0.63064265 = fieldWeight in 3187, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
          0.5051501 = weight(abstract_txt:stemming in 3187) [ClassicSimilarity], result of:
            0.5051501 = score(doc=3187,freq=2.0), product of:
              0.61383677 = queryWeight, product of:
                6.197573 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.013297461 = queryNorm
              0.8229388 = fieldWeight in 3187, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=3187)
        0.24 = coord(6/25)
    
  5. Duwairi, R.; Al-Refai, M.N.; Khasawneh, N.: Feature reduction techniques for Arabic text categorization (2009) 0.19
    0.19124702 = sum of:
      0.19124702 = product of:
        0.7968626 = sum of:
          0.0048580198 = weight(abstract_txt:this in 3169) [ClassicSimilarity], result of:
            0.0048580198 = score(doc=3169,freq=1.0), product of:
              0.032212082 = queryWeight, product of:
                1.003898 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.013297461 = queryNorm
              0.1508136 = fieldWeight in 3169, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
          0.014360843 = weight(abstract_txt:using in 3169) [ClassicSimilarity], result of:
            0.014360843 = score(doc=3169,freq=1.0), product of:
              0.066348724 = queryWeight, product of:
                1.4407761 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.013297461 = queryNorm
              0.21644491 = fieldWeight in 3169, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
          0.02286485 = weight(abstract_txt:text in 3169) [ClassicSimilarity], result of:
            0.02286485 = score(doc=3169,freq=1.0), product of:
              0.090467274 = queryWeight, product of:
                1.6823872 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013297461 = queryNorm
              0.25274166 = fieldWeight in 3169, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
          0.024158593 = weight(abstract_txt:each in 3169) [ClassicSimilarity], result of:
            0.024158593 = score(doc=3169,freq=1.0), product of:
              0.09384844 = queryWeight, product of:
                1.7135379 = boost
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.013297461 = queryNorm
              0.25742137 = fieldWeight in 3169, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
          0.15910819 = weight(abstract_txt:words in 3169) [ClassicSimilarity], result of:
            0.15910819 = score(doc=3169,freq=4.0), product of:
              0.23778515 = queryWeight, product of:
                3.3405519 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.013297461 = queryNorm
              0.66912585 = fieldWeight in 3169, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
          0.5715121 = weight(abstract_txt:stemming in 3169) [ClassicSimilarity], result of:
            0.5715121 = score(doc=3169,freq=4.0), product of:
              0.61383677 = queryWeight, product of:
                6.197573 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.013297461 = queryNorm
              0.931049 = fieldWeight in 3169, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.0625 = fieldNorm(doc=3169)
        0.24 = coord(6/25)
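
The content scores above combine several term matches. As a rough sketch of that arithmetic, assuming the ClassicSimilarity formulas (queryWeight = boost * idf * queryNorm, fieldWeight = sqrt(freq) * idf * fieldNorm, and a coord factor equal to matched clauses over total clauses), the first content match can be recomputed as follows. The function names are hypothetical, terms listed without a boost line are assumed to have boost 1.0, and queryNorm and fieldNorm are copied from the listing.

```python
import math

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # Score of one query term in one document: queryWeight * fieldWeight.
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, total_clauses):
    # Sum of the matching term scores, scaled by the coord factor.
    return sum(term_scores) * (matched / total_clauses)

# abstract_txt:stemming in doc 5798, values copied from the listing.
stemming = term_score(freq=3.0, doc_freq=69, max_docs=44218,
                      field_norm=0.078125, query_norm=0.013297461,
                      boost=6.197573)
print(round(stemming, 4))

# The seven term scores listed for doc 5798, with coord(7/25).
listed = [0.017951056, 0.028581062, 0.04270676, 0.1605944,
          0.1791643, 0.1406331, 0.61868]
print(round(document_score(listed, 7, 25), 4))
```

The coord factor explains why a document matching only four query terms with high weights (entry 3) can end up ranked below one matching seven terms with lower individual weights (entry 1).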