Document (#27235)

Author
Díaz, I.
Ranilla, J.
Montañes, E.
Fernández, J.
Combarro, E.F.
Title
Improving performance of text categorization by combining filtering and support vector machines
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.7, p.579-592
Year
2004
Abstract
Text Categorization is the process of assigning documents to a set of previously fixed categories. A lot of research is going on with the goal of automating this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to prove that a previous filtering of the words used by SVM in the classification can improve the overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, and that it also has the effect of significantly improving the overall performance.
Theme
Automatic classification (Automatisches Klassifizieren)

Similar documents (author)

  1. Díaz, P.: Usability of hypermedia educational e-books (2003) 1.98
    1.975418 = sum of:
      1.975418 = product of:
        3.950836 = sum of:
          3.950836 = weight(author_txt:díaz in 1198) [ClassicSimilarity], result of:
            3.950836 = score(doc=1198,freq=1.0), product of:
              0.7305907 = queryWeight, product of:
                1.0343924 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.081630796 = queryNorm
              5.407728 = fieldWeight in 1198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.625 = fieldNorm(doc=1198)
        0.5 = coord(1/2)
    
  2. Díaz, J.P. -> Pino-Díaz, J.: 1.96
    1.9555639 = sum of:
      1.9555639 = product of:
        3.9111278 = sum of:
          3.9111278 = weight(author_txt:díaz in 52) [ClassicSimilarity], result of:
            3.9111278 = score(doc=52,freq=2.0), product of:
              0.7305907 = queryWeight, product of:
                1.0343924 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.081630796 = queryNorm
              5.3533773 = fieldWeight in 52, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.4375 = fieldNorm(doc=52)
        0.5 = coord(1/2)
    
  3. Moreno Fernández, L.M. -> Fernández, L.M.M.: 1.77
    1.7669166 = sum of:
      1.7669166 = product of:
        3.5338333 = sum of:
          3.5338333 = weight(author_txt:fernández in 5951) [ClassicSimilarity], result of:
            3.5338333 = score(doc=5951,freq=2.0), product of:
              0.68281573 = queryWeight, product of:
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.081630796 = queryNorm
              5.1753836 = fieldWeight in 5951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.4375 = fieldNorm(doc=5951)
        0.5 = coord(1/2)
    
  4. Esteban, A. Díaz -> Díaz Esteban, A.: 1.68
    1.6761976 = sum of:
      1.6761976 = product of:
        3.3523953 = sum of:
          3.3523953 = weight(author_txt:díaz in 2747) [ClassicSimilarity], result of:
            3.3523953 = score(doc=2747,freq=2.0), product of:
              0.7305907 = queryWeight, product of:
                1.0343924 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.081630796 = queryNorm
              4.588609 = fieldWeight in 2747, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.375 = fieldNorm(doc=2747)
        0.5 = coord(1/2)
    
  5. Díaz, N.P. Cruz -> Cruz Díaz, N.P.: 1.68
    1.6761976 = sum of:
      1.6761976 = product of:
        3.3523953 = sum of:
          3.3523953 = weight(author_txt:díaz in 233) [ClassicSimilarity], result of:
            3.3523953 = score(doc=233,freq=2.0), product of:
              0.7305907 = queryWeight, product of:
                1.0343924 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.081630796 = queryNorm
              4.588609 = fieldWeight in 233, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.375 = fieldNorm(doc=233)
        0.5 = coord(1/2)
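
The score trees above all follow the same pattern: a query weight (boost, idf, queryNorm) multiplied by a field weight (tf, idf, fieldNorm), then scaled by coord. The following is a minimal sketch that recomputes entry 1 from its leaf values. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf figures listed here; queryNorm, boost, fieldNorm and coord are simply copied from the listing, and the function names are illustrative rather than part of any library.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Assumed tf formula: square root of the raw term frequency.
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Leaf values copied from entry 1 (author_txt:díaz in doc 1198).
score = term_score(
    freq=1.0,
    doc_freq=20,
    max_docs=44218,
    field_norm=0.625,
    query_norm=0.081630796,
    boost=1.0343924,
)
final = score * 0.5  # coord(1/2): one of two query clauses matched

print(score, final)
```

Up to floating-point rounding (the listing was produced with single-precision floats), the two printed values agree with the 3.950836 and 1.975418 reported for entry 1. Entries without a boost line, such as entry 3, correspond to boost=1.0.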
    

Similar documents (content)

  1. Yang, Y.; Liu, X.: ¬A re-examination of text categorization methods (1999) 0.41
    0.4141775 = sum of:
      0.4141775 = product of:
        0.9413125 = sum of:
          0.026418298 = weight(abstract_txt:results in 3386) [ClassicSimilarity], result of:
            0.026418298 = score(doc=3386,freq=1.0), product of:
              0.0809193 = queryWeight, product of:
                1.0364115 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.02242015 = queryNorm
              0.32647708 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.017653193 = weight(abstract_txt:that in 3386) [ClassicSimilarity], result of:
            0.017653193 = score(doc=3386,freq=2.0), product of:
              0.05619334 = queryWeight, product of:
                1.0577772 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.02242015 = queryNorm
              0.314151 = fieldWeight in 3386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.013183579 = weight(abstract_txt:this in 3386) [ClassicSimilarity], result of:
            0.013183579 = score(doc=3386,freq=1.0), product of:
              0.058277585 = queryWeight, product of:
                1.0772154 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.02242015 = queryNorm
              0.2262204 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.020727549 = weight(abstract_txt:with in 3386) [ClassicSimilarity], result of:
            0.020727549 = score(doc=3386,freq=2.0), product of:
              0.062541455 = queryWeight, product of:
                1.115927 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02242015 = queryNorm
              0.33142096 = fieldWeight in 3386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.041366715 = weight(abstract_txt:text in 3386) [ClassicSimilarity], result of:
            0.041366715 = score(doc=3386,freq=1.0), product of:
              0.109114625 = queryWeight, product of:
                1.2035043 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.02242015 = queryNorm
              0.37911248 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.052503884 = weight(abstract_txt:support in 3386) [ClassicSimilarity], result of:
            0.052503884 = score(doc=3386,freq=1.0), product of:
              0.12791158 = queryWeight, product of:
                1.30305 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.02242015 = queryNorm
              0.41047013 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.10433261 = weight(abstract_txt:significantly in 3386) [ClassicSimilarity], result of:
            0.10433261 = score(doc=3386,freq=1.0), product of:
              0.2021757 = queryWeight, product of:
                1.638214 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.02242015 = queryNorm
              0.5160492 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.1734406 = weight(abstract_txt:vector in 3386) [ClassicSimilarity], result of:
            0.1734406 = score(doc=3386,freq=1.0), product of:
              0.2837153 = queryWeight, product of:
                1.9406514 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.02242015 = queryNorm
              0.6113192 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.17910308 = weight(abstract_txt:categorization in 3386) [ClassicSimilarity], result of:
            0.17910308 = score(doc=3386,freq=1.0), product of:
              0.28985733 = queryWeight, product of:
                1.9615451 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.02242015 = queryNorm
              0.6179008 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.09315647 = weight(abstract_txt:performance in 3386) [ClassicSimilarity], result of:
            0.09315647 = score(doc=3386,freq=1.0), product of:
              0.21459587 = queryWeight, product of:
                2.0671046 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.02242015 = queryNorm
              0.43410188 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
          0.21942659 = weight(abstract_txt:machines in 3386) [ClassicSimilarity], result of:
            0.21942659 = score(doc=3386,freq=1.0), product of:
              0.3318754 = queryWeight, product of:
                2.0989094 = boost
                7.0524964 = idf(docFreq=103, maxDocs=44218)
                0.02242015 = queryNorm
              0.66117156 = fieldWeight in 3386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0524964 = idf(docFreq=103, maxDocs=44218)
                0.09375 = fieldNorm(doc=3386)
        0.44 = coord(11/25)
    
  2. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.32
    0.31812283 = sum of:
      0.31812283 = product of:
        0.7230064 = sum of:
          0.014413771 = weight(abstract_txt:that in 831) [ClassicSimilarity], result of:
            0.014413771 = score(doc=831,freq=3.0), product of:
              0.05619334 = queryWeight, product of:
                1.0577772 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.02242015 = queryNorm
              0.2565032 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.012429597 = weight(abstract_txt:this in 831) [ClassicSimilarity], result of:
            0.012429597 = score(doc=831,freq=2.0), product of:
              0.058277585 = queryWeight, product of:
                1.0772154 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.02242015 = queryNorm
              0.21328263 = fieldWeight in 831, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.016923973 = weight(abstract_txt:with in 831) [ClassicSimilarity], result of:
            0.016923973 = score(doc=831,freq=3.0), product of:
              0.062541455 = queryWeight, product of:
                1.115927 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02242015 = queryNorm
              0.27060407 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.061665867 = weight(abstract_txt:text in 831) [ClassicSimilarity], result of:
            0.061665867 = score(doc=831,freq=5.0), product of:
              0.109114625 = queryWeight, product of:
                1.2035043 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.02242015 = queryNorm
              0.5651476 = fieldWeight in 831, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.03500259 = weight(abstract_txt:support in 831) [ClassicSimilarity], result of:
            0.03500259 = score(doc=831,freq=1.0), product of:
              0.12791158 = queryWeight, product of:
                1.30305 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.02242015 = queryNorm
              0.27364674 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.069555074 = weight(abstract_txt:significantly in 831) [ClassicSimilarity], result of:
            0.069555074 = score(doc=831,freq=1.0), product of:
              0.2021757 = queryWeight, product of:
                1.638214 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.02242015 = queryNorm
              0.3440328 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.0858184 = weight(abstract_txt:improving in 831) [ClassicSimilarity], result of:
            0.0858184 = score(doc=831,freq=1.0), product of:
              0.23257518 = queryWeight, product of:
                1.757065 = boost
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.02242015 = queryNorm
              0.3689921 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.115627065 = weight(abstract_txt:vector in 831) [ClassicSimilarity], result of:
            0.115627065 = score(doc=831,freq=1.0), product of:
              0.2837153 = queryWeight, product of:
                1.9406514 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.02242015 = queryNorm
              0.4075461 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.12420863 = weight(abstract_txt:performance in 831) [ClassicSimilarity], result of:
            0.12420863 = score(doc=831,freq=4.0), product of:
              0.21459587 = queryWeight, product of:
                2.0671046 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.02242015 = queryNorm
              0.5788025 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.14628439 = weight(abstract_txt:machines in 831) [ClassicSimilarity], result of:
            0.14628439 = score(doc=831,freq=1.0), product of:
              0.3318754 = queryWeight, product of:
                2.0989094 = boost
                7.0524964 = idf(docFreq=103, maxDocs=44218)
                0.02242015 = queryNorm
              0.44078103 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0524964 = idf(docFreq=103, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.04107709 = weight(abstract_txt:different in 831) [ClassicSimilarity], result of:
            0.04107709 = score(doc=831,freq=1.0), product of:
              0.17930244 = queryWeight, product of:
                2.1817966 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.02242015 = queryNorm
              0.22909386 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
        0.44 = coord(11/25)
    
  3. Mostafa, J.; Quiroga, L.M.; Palakal, M.: Filtering medical documents using automated and human classification methods (1998) 0.29
    0.2850379 = sum of:
      0.2850379 = product of:
        0.8907435 = sum of:
          0.0791019 = weight(abstract_txt:improves in 2326) [ClassicSimilarity], result of:
            0.0791019 = score(doc=2326,freq=1.0), product of:
              0.15066685 = queryWeight, product of:
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.02242015 = queryNorm
              0.52501196 = fieldWeight in 2326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.031134263 = weight(abstract_txt:results in 2326) [ClassicSimilarity], result of:
            0.031134263 = score(doc=2326,freq=2.0), product of:
              0.0809193 = queryWeight, product of:
                1.0364115 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.02242015 = queryNorm
              0.38475692 = fieldWeight in 2326, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.0104022445 = weight(abstract_txt:that in 2326) [ClassicSimilarity], result of:
            0.0104022445 = score(doc=2326,freq=1.0), product of:
              0.05619334 = queryWeight, product of:
                1.0577772 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.02242015 = queryNorm
              0.18511525 = fieldWeight in 2326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.010986315 = weight(abstract_txt:this in 2326) [ClassicSimilarity], result of:
            0.010986315 = score(doc=2326,freq=1.0), product of:
              0.058277585 = queryWeight, product of:
                1.0772154 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.02242015 = queryNorm
              0.18851699 = fieldWeight in 2326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.017272959 = weight(abstract_txt:with in 2326) [ClassicSimilarity], result of:
            0.017272959 = score(doc=2326,freq=2.0), product of:
              0.062541455 = queryWeight, product of:
                1.115927 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02242015 = queryNorm
              0.27618414 = fieldWeight in 2326, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.15526079 = weight(abstract_txt:performance in 2326) [ClassicSimilarity], result of:
            0.15526079 = score(doc=2326,freq=4.0), product of:
              0.21459587 = queryWeight, product of:
                2.0671046 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.02242015 = queryNorm
              0.7235032 = fieldWeight in 2326, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.05134636 = weight(abstract_txt:different in 2326) [ClassicSimilarity], result of:
            0.05134636 = score(doc=2326,freq=1.0), product of:
              0.17930244 = queryWeight, product of:
                2.1817966 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.02242015 = queryNorm
              0.28636733 = fieldWeight in 2326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
          0.5352386 = weight(abstract_txt:filtering in 2326) [ClassicSimilarity], result of:
            0.5352386 = score(doc=2326,freq=6.0), product of:
              0.42780715 = queryWeight, product of:
                2.9186082 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.02242015 = queryNorm
              1.2511213 = fieldWeight in 2326, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.078125 = fieldNorm(doc=2326)
        0.32 = coord(8/25)
    
  4. Quiroga, L.M.; Mostafa, J.: ¬An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.26
    0.25645638 = sum of:
      0.25645638 = product of:
        0.7123788 = sum of:
          0.02490741 = weight(abstract_txt:results in 2579) [ClassicSimilarity], result of:
            0.02490741 = score(doc=2579,freq=2.0), product of:
              0.0809193 = queryWeight, product of:
                1.0364115 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.02242015 = queryNorm
              0.30780554 = fieldWeight in 2579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.016643591 = weight(abstract_txt:that in 2579) [ClassicSimilarity], result of:
            0.016643591 = score(doc=2579,freq=4.0), product of:
              0.05619334 = queryWeight, product of:
                1.0577772 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.02242015 = queryNorm
              0.2961844 = fieldWeight in 2579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.008789052 = weight(abstract_txt:this in 2579) [ClassicSimilarity], result of:
            0.008789052 = score(doc=2579,freq=1.0), product of:
              0.058277585 = queryWeight, product of:
                1.0772154 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.02242015 = queryNorm
              0.1508136 = fieldWeight in 2579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.00977106 = weight(abstract_txt:with in 2579) [ClassicSimilarity], result of:
            0.00977106 = score(doc=2579,freq=1.0), product of:
              0.062541455 = queryWeight, product of:
                1.115927 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02242015 = queryNorm
              0.15623334 = fieldWeight in 2579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.05079578 = weight(abstract_txt:three in 2579) [ClassicSimilarity], result of:
            0.05079578 = score(doc=2579,freq=2.0), product of:
              0.13013223 = queryWeight, product of:
                1.3143123 = boost
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.02242015 = queryNorm
              0.39033973 = fieldWeight in 2579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.069555074 = weight(abstract_txt:significantly in 2579) [ClassicSimilarity], result of:
            0.069555074 = score(doc=2579,freq=1.0), product of:
              0.2021757 = queryWeight, product of:
                1.638214 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.02242015 = queryNorm
              0.3440328 = fieldWeight in 2579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.12420863 = weight(abstract_txt:performance in 2579) [ClassicSimilarity], result of:
            0.12420863 = score(doc=2579,freq=4.0), product of:
              0.21459587 = queryWeight, product of:
                2.0671046 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.02242015 = queryNorm
              0.5788025 = fieldWeight in 2579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.058091775 = weight(abstract_txt:different in 2579) [ClassicSimilarity], result of:
            0.058091775 = score(doc=2579,freq=2.0), product of:
              0.17930244 = queryWeight, product of:
                2.1817966 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.02242015 = queryNorm
              0.32398763 = fieldWeight in 2579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.3496164 = weight(abstract_txt:filtering in 2579) [ClassicSimilarity], result of:
            0.3496164 = score(doc=2579,freq=4.0), product of:
              0.42780715 = queryWeight, product of:
                2.9186082 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.02242015 = queryNorm
              0.817229 = fieldWeight in 2579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
        0.36 = coord(9/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.24
    0.24482948 = sum of:
      0.24482948 = product of:
        0.68008184 = sum of:
          0.094922274 = weight(abstract_txt:improves in 1595) [ClassicSimilarity], result of:
            0.094922274 = score(doc=1595,freq=1.0), product of:
              0.15066685 = queryWeight, product of:
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.02242015 = queryNorm
              0.63001436 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.026418298 = weight(abstract_txt:results in 1595) [ClassicSimilarity], result of:
            0.026418298 = score(doc=1595,freq=1.0), product of:
              0.0809193 = queryWeight, product of:
                1.0364115 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.02242015 = queryNorm
              0.32647708 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.021620655 = weight(abstract_txt:that in 1595) [ClassicSimilarity], result of:
            0.021620655 = score(doc=1595,freq=3.0), product of:
              0.05619334 = queryWeight, product of:
                1.0577772 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.02242015 = queryNorm
              0.38475478 = fieldWeight in 1595, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.013183579 = weight(abstract_txt:this in 1595) [ClassicSimilarity], result of:
            0.013183579 = score(doc=1595,freq=1.0), product of:
              0.058277585 = queryWeight, product of:
                1.0772154 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.02242015 = queryNorm
              0.2262204 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.01465659 = weight(abstract_txt:with in 1595) [ClassicSimilarity], result of:
            0.01465659 = score(doc=1595,freq=1.0), product of:
              0.062541455 = queryWeight, product of:
                1.115927 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02242015 = queryNorm
              0.23435001 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.058501374 = weight(abstract_txt:text in 1595) [ClassicSimilarity], result of:
            0.058501374 = score(doc=1595,freq=2.0), product of:
              0.109114625 = queryWeight, product of:
                1.2035043 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.02242015 = queryNorm
              0.53614604 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.10433261 = weight(abstract_txt:significantly in 1595) [ClassicSimilarity], result of:
            0.10433261 = score(doc=1595,freq=1.0), product of:
              0.2021757 = queryWeight, product of:
                1.638214 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.02242015 = queryNorm
              0.5160492 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.25329 = weight(abstract_txt:categorization in 1595) [ClassicSimilarity], result of:
            0.25329 = score(doc=1595,freq=2.0), product of:
              0.28985733 = queryWeight, product of:
                1.9615451 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.02242015 = queryNorm
              0.87384367 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.09315647 = weight(abstract_txt:performance in 1595) [ClassicSimilarity], result of:
            0.09315647 = score(doc=1595,freq=1.0), product of:
              0.21459587 = queryWeight, product of:
                2.0671046 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.02242015 = queryNorm
              0.43410188 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
        0.36 = coord(9/25)
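
For the content-based matches, each document's final score is the sum of its per-term weights multiplied by coord, the fraction of query terms that matched. As a rough check, the sketch below recombines the eleven term weights listed for entry 1 (doc 3386). The helper name `combine` is hypothetical, and the total of 25 query terms is taken from the coord(11/25) line rather than derived.

```python
def combine(term_scores, query_term_count):
    """Sum the per-term weights and scale by coord = matched / total query terms."""
    coord = len(term_scores) / query_term_count
    return sum(term_scores) * coord


# Per-term weights copied from entry 1 (doc 3386), in listing order:
# results, that, this, with, text, support, significantly, vector,
# categorization, performance, machines.
doc_3386 = [
    0.026418298,
    0.017653193,
    0.013183579,
    0.020727549,
    0.041366715,
    0.052503884,
    0.10433261,
    0.1734406,
    0.17910308,
    0.09315647,
    0.21942659,
]

total = combine(doc_3386, 25)
print(total)
```

The result agrees with the listed 0.4141775 to within rounding. The coord factor explains why entry 3 ranks below entry 2 despite a larger raw sum (0.8907435 against 0.7230064): it matched only 8 of the 25 query terms.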