Document (#27236)

Author
Díaz, I.
Ranilla, J.
Montañes, E.
Fernández, J.
Combarro, E.F.
Title
Improving performance of text categorization by combining filtering and support vector machines
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.7, pp.579-592
Year
2004
Abstract
Text Categorization is the process of assigning documents to a set of previously fixed categories. A lot of research is going on with the goal of automating this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to prove that a previous filtering of the words used by SVM in the classification can improve the overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, and that this in turn significantly improves the overall performance.
Theme
Automatic classification

Similar documents (author)

  1. Díaz, P.: Usability of hypermedia educational e-books (2003) 1.97
    1.9671593 = sum of:
      1.9671593 = product of:
        3.9343185 = sum of:
          3.9343185 = weight(author_txt:díaz in 2378) [ClassicSimilarity], result of:
            3.9343185 = score(doc=2378,freq=1.0), product of:
              0.7306923 = queryWeight, product of:
                1.0345467 = boost
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.08198407 = queryNorm
              5.384371 = fieldWeight in 2378, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.625 = fieldNorm(doc=2378)
        0.5 = coord(1/2)
    
  2. Díaz, J.P. -> Pino-Díaz, J.: 1.95
    1.9473882 = sum of:
      1.9473882 = product of:
        3.8947763 = sum of:
          3.8947763 = weight(author_txt:díaz in 1053) [ClassicSimilarity], result of:
            3.8947763 = score(doc=1053,freq=2.0), product of:
              0.7306923 = queryWeight, product of:
                1.0345467 = boost
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.08198407 = queryNorm
              5.330255 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.4375 = fieldNorm(doc=1053)
        0.5 = coord(1/2)
    
  3. Moreno Fernández, L.M. -> Fernández, L.M.M.: 1.76
    1.7587421 = sum of:
      1.7587421 = product of:
        3.5174842 = sum of:
          3.5174842 = weight(author_txt:fernández in 5951) [ClassicSimilarity], result of:
            3.5174842 = score(doc=5951,freq=2.0), product of:
              0.6827069 = queryWeight, product of:
                8.3273115 = idf(docFreq=27, maxDocs=42596)
                0.08198407 = queryNorm
              5.1522613 = fieldWeight in 5951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.3273115 = idf(docFreq=27, maxDocs=42596)
                0.4375 = fieldNorm(doc=5951)
        0.5 = coord(1/2)
    
  4. Esteban, A. Díaz -> Díaz Esteban, A.: 1.67
    1.6691899 = sum of:
      1.6691899 = product of:
        3.3383799 = sum of:
          3.3383799 = weight(author_txt:díaz in 3748) [ClassicSimilarity], result of:
            3.3383799 = score(doc=3748,freq=2.0), product of:
              0.7306923 = queryWeight, product of:
                1.0345467 = boost
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.08198407 = queryNorm
              4.56879 = fieldWeight in 3748, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.375 = fieldNorm(doc=3748)
        0.5 = coord(1/2)
    
  5. Díaz, N.P. Cruz -> Cruz Díaz, N.P.: 1.67
    1.6691899 = sum of:
      1.6691899 = product of:
        3.3383799 = sum of:
          3.3383799 = weight(author_txt:díaz in 1234) [ClassicSimilarity], result of:
            3.3383799 = score(doc=1234,freq=2.0), product of:
              0.7306923 = queryWeight, product of:
                1.0345467 = boost
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.08198407 = queryNorm
              4.56879 = fieldWeight in 1234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.614993 = idf(docFreq=20, maxDocs=42596)
                0.375 = fieldNorm(doc=1234)
        0.5 = coord(1/2)
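
The score trees above follow the TF-IDF pattern of Lucene's ClassicSimilarity: queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following is a minimal sketch that recomputes the first entry (doc=2378) from the values shown in its tree. The function names are hypothetical and are not part of Lucene's API, and Python uses double precision where Lucene uses 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the trees above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # Score of one matching term: queryWeight * fieldWeight.
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    query_weight = boost * idf * query_norm   # query-side factor
    field_weight = tf * idf * field_norm      # document-side factor
    return query_weight * field_weight

# Values copied from the tree for doc=2378 (author_txt:díaz).
term_score = classic_term_score(
    freq=1.0, doc_freq=20, max_docs=42596,
    field_norm=0.625, query_norm=0.08198407, boost=1.0345467,
)
final = term_score * 0.5   # coord(1/2): one of two query terms matched
print(final)               # compare with the listed 1.9671593
```

Because idf enters both the query weight and the field weight, a rare author name such as the one matched here (docFreq=20) dominates the score.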
    

Similar documents (content)

  1. Yang, Y.; Liu, X.: A re-examination of text categorization methods (1999) 0.42
    0.4175431 = sum of:
      0.4175431 = product of:
        0.9489616 = sum of:
          0.026746716 = weight(abstract_txt:results in 4387) [ClassicSimilarity], result of:
            0.026746716 = score(doc=4387,freq=1.0), product of:
              0.081350416 = queryWeight, product of:
                1.0350237 = boost
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.022411454 = queryNorm
              0.32878402 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.018087214 = weight(abstract_txt:that in 4387) [ClassicSimilarity], result of:
            0.018087214 = score(doc=4387,freq=2.0), product of:
              0.05694396 = queryWeight, product of:
                1.0605713 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.022411454 = queryNorm
              0.3176318 = fieldWeight in 4387, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.013584798 = weight(abstract_txt:this in 4387) [ClassicSimilarity], result of:
            0.013584798 = score(doc=4387,freq=1.0), product of:
              0.05928052 = queryWeight, product of:
                1.0821116 = boost
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.022411454 = queryNorm
              0.22916126 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.021013454 = weight(abstract_txt:with in 4387) [ClassicSimilarity], result of:
            0.021013454 = score(doc=4387,freq=2.0), product of:
              0.06293103 = queryWeight, product of:
                1.1149322 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.022411454 = queryNorm
              0.33391243 = fieldWeight in 4387, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.04116673 = weight(abstract_txt:text in 4387) [ClassicSimilarity], result of:
            0.04116673 = score(doc=4387,freq=1.0), product of:
              0.108445205 = queryWeight, product of:
                1.1950212 = boost
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.022411454 = queryNorm
              0.37960857 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.053322203 = weight(abstract_txt:support in 4387) [ClassicSimilarity], result of:
            0.053322203 = score(doc=4387,freq=1.0), product of:
              0.12886001 = queryWeight, product of:
                1.3026552 = boost
                4.413861 = idf(docFreq=1401, maxDocs=42596)
                0.022411454 = queryNorm
              0.41379946 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.413861 = idf(docFreq=1401, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.1056412 = weight(abstract_txt:significantly in 4387) [ClassicSimilarity], result of:
            0.1056412 = score(doc=4387,freq=1.0), product of:
              0.20326768 = queryWeight, product of:
                1.6360801 = boost
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.022411454 = queryNorm
              0.5197147 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.17123778 = weight(abstract_txt:vector in 4387) [ClassicSimilarity], result of:
            0.17123778 = score(doc=4387,freq=1.0), product of:
              0.28048685 = queryWeight, product of:
                1.9218817 = boost
                6.512021 = idf(docFreq=171, maxDocs=42596)
                0.022411454 = queryNorm
              0.610502 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.512021 = idf(docFreq=171, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.18171678 = weight(abstract_txt:categorization in 4387) [ClassicSimilarity], result of:
            0.18171678 = score(doc=4387,freq=1.0), product of:
              0.29181623 = queryWeight, product of:
                1.9603117 = boost
                6.6422358 = idf(docFreq=150, maxDocs=42596)
                0.022411454 = queryNorm
              0.62270963 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6422358 = idf(docFreq=150, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.09298439 = weight(abstract_txt:performance in 4387) [ClassicSimilarity], result of:
            0.09298439 = score(doc=4387,freq=1.0), product of:
              0.21370593 = queryWeight, product of:
                2.054586 = boost
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.022411454 = queryNorm
              0.43510443 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
          0.22346035 = weight(abstract_txt:machines in 4387) [ClassicSimilarity], result of:
            0.22346035 = score(doc=4387,freq=1.0), product of:
              0.33494982 = queryWeight, product of:
                2.1001983 = boost
                7.116221 = idf(docFreq=93, maxDocs=42596)
                0.022411454 = queryNorm
              0.6671457 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.116221 = idf(docFreq=93, maxDocs=42596)
                0.09375 = fieldNorm(doc=4387)
        0.44 = coord(11/25)
    
  2. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.32
    0.3202714 = sum of:
      0.3202714 = product of:
        0.72788954 = sum of:
          0.014768148 = weight(abstract_txt:that in 2011) [ClassicSimilarity], result of:
            0.014768148 = score(doc=2011,freq=3.0), product of:
              0.05694396 = queryWeight, product of:
                1.0605713 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.022411454 = queryNorm
              0.2593453 = fieldWeight in 2011, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.01280787 = weight(abstract_txt:this in 2011) [ClassicSimilarity], result of:
            0.01280787 = score(doc=2011,freq=2.0), product of:
              0.05928052 = queryWeight, product of:
                1.0821116 = boost
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.022411454 = queryNorm
              0.2160553 = fieldWeight in 2011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.017157413 = weight(abstract_txt:with in 2011) [ClassicSimilarity], result of:
            0.017157413 = score(doc=2011,freq=3.0), product of:
              0.06293103 = queryWeight, product of:
                1.1149322 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.022411454 = queryNorm
              0.27263835 = fieldWeight in 2011, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.061367735 = weight(abstract_txt:text in 2011) [ClassicSimilarity], result of:
            0.061367735 = score(doc=2011,freq=5.0), product of:
              0.108445205 = queryWeight, product of:
                1.1950212 = boost
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.022411454 = queryNorm
              0.56588703 = fieldWeight in 2011, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.035548136 = weight(abstract_txt:support in 2011) [ClassicSimilarity], result of:
            0.035548136 = score(doc=2011,freq=1.0), product of:
              0.12886001 = queryWeight, product of:
                1.3026552 = boost
                4.413861 = idf(docFreq=1401, maxDocs=42596)
                0.022411454 = queryNorm
              0.2758663 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.413861 = idf(docFreq=1401, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.07042747 = weight(abstract_txt:significantly in 2011) [ClassicSimilarity], result of:
            0.07042747 = score(doc=2011,freq=1.0), product of:
              0.20326768 = queryWeight, product of:
                1.6360801 = boost
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.022411454 = queryNorm
              0.34647647 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.087038666 = weight(abstract_txt:improving in 2011) [ClassicSimilarity], result of:
            0.087038666 = score(doc=2011,freq=1.0), product of:
              0.2340894 = queryWeight, product of:
                1.7557443 = boost
                5.9490886 = idf(docFreq=301, maxDocs=42596)
                0.022411454 = queryNorm
              0.37181804 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9490886 = idf(docFreq=301, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.11415852 = weight(abstract_txt:vector in 2011) [ClassicSimilarity], result of:
            0.11415852 = score(doc=2011,freq=1.0), product of:
              0.28048685 = queryWeight, product of:
                1.9218817 = boost
                6.512021 = idf(docFreq=171, maxDocs=42596)
                0.022411454 = queryNorm
              0.40700132 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.512021 = idf(docFreq=171, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.12397919 = weight(abstract_txt:performance in 2011) [ClassicSimilarity], result of:
            0.12397919 = score(doc=2011,freq=4.0), product of:
              0.21370593 = queryWeight, product of:
                2.054586 = boost
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.022411454 = queryNorm
              0.5801392 = fieldWeight in 2011, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.14897355 = weight(abstract_txt:machines in 2011) [ClassicSimilarity], result of:
            0.14897355 = score(doc=2011,freq=1.0), product of:
              0.33494982 = queryWeight, product of:
                2.1001983 = boost
                7.116221 = idf(docFreq=93, maxDocs=42596)
                0.022411454 = queryNorm
              0.4447638 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.116221 = idf(docFreq=93, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
          0.041662812 = weight(abstract_txt:different in 2011) [ClassicSimilarity], result of:
            0.041662812 = score(doc=2011,freq=1.0), product of:
              0.18047456 = queryWeight, product of:
                2.1801853 = boost
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.022411454 = queryNorm
              0.23085143 = fieldWeight in 2011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.0625 = fieldNorm(doc=2011)
        0.44 = coord(11/25)
    
  3. Mostafa, J.; Quiroga, L.M.; Palakal, M.: Filtering medical documents using automated and human classification methods (1998) 0.28
    0.28440097 = sum of:
      0.28440097 = product of:
        0.88875306 = sum of:
          0.08067883 = weight(abstract_txt:improves in 3327) [ClassicSimilarity], result of:
            0.08067883 = score(doc=3327,freq=1.0), product of:
              0.15221706 = queryWeight, product of:
                1.0011221 = boost
                6.7843184 = idf(docFreq=130, maxDocs=42596)
                0.022411454 = queryNorm
              0.5300249 = fieldWeight in 3327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7843184 = idf(docFreq=130, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.031521305 = weight(abstract_txt:results in 3327) [ClassicSimilarity], result of:
            0.031521305 = score(doc=3327,freq=2.0), product of:
              0.081350416 = queryWeight, product of:
                1.0350237 = boost
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.022411454 = queryNorm
              0.38747567 = fieldWeight in 3327, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.010657993 = weight(abstract_txt:that in 3327) [ClassicSimilarity], result of:
            0.010657993 = score(doc=3327,freq=1.0), product of:
              0.05694396 = queryWeight, product of:
                1.0605713 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.022411454 = queryNorm
              0.18716635 = fieldWeight in 3327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.011320665 = weight(abstract_txt:this in 3327) [ClassicSimilarity], result of:
            0.011320665 = score(doc=3327,freq=1.0), product of:
              0.05928052 = queryWeight, product of:
                1.0821116 = boost
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.022411454 = queryNorm
              0.19096771 = fieldWeight in 3327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.017511213 = weight(abstract_txt:with in 3327) [ClassicSimilarity], result of:
            0.017511213 = score(doc=3327,freq=2.0), product of:
              0.06293103 = queryWeight, product of:
                1.1149322 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.022411454 = queryNorm
              0.27826038 = fieldWeight in 3327, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.15497398 = weight(abstract_txt:performance in 3327) [ClassicSimilarity], result of:
            0.15497398 = score(doc=3327,freq=4.0), product of:
              0.21370593 = queryWeight, product of:
                2.054586 = boost
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.022411454 = queryNorm
              0.725174 = fieldWeight in 3327, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.052078515 = weight(abstract_txt:different in 3327) [ClassicSimilarity], result of:
            0.052078515 = score(doc=3327,freq=1.0), product of:
              0.18047456 = queryWeight, product of:
                2.1801853 = boost
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.022411454 = queryNorm
              0.2885643 = fieldWeight in 3327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
          0.5300106 = weight(abstract_txt:filtering in 3327) [ClassicSimilarity], result of:
            0.5300106 = score(doc=3327,freq=6.0), product of:
              0.42377627 = queryWeight, product of:
                2.8932393 = boost
                6.5355515 = idf(docFreq=167, maxDocs=42596)
                0.022411454 = queryNorm
              1.2506849 = fieldWeight in 3327, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5355515 = idf(docFreq=167, maxDocs=42596)
                0.078125 = fieldNorm(doc=3327)
        0.32 = coord(8/25)
    
  4. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.26
    0.25646868 = sum of:
      0.25646868 = product of:
        0.71241295 = sum of:
          0.025217045 = weight(abstract_txt:results in 3580) [ClassicSimilarity], result of:
            0.025217045 = score(doc=3580,freq=2.0), product of:
              0.081350416 = queryWeight, product of:
                1.0350237 = boost
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.022411454 = queryNorm
              0.30998054 = fieldWeight in 3580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.01705279 = weight(abstract_txt:that in 3580) [ClassicSimilarity], result of:
            0.01705279 = score(doc=3580,freq=4.0), product of:
              0.05694396 = queryWeight, product of:
                1.0605713 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.022411454 = queryNorm
              0.29946616 = fieldWeight in 3580, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.009056532 = weight(abstract_txt:this in 3580) [ClassicSimilarity], result of:
            0.009056532 = score(doc=3580,freq=1.0), product of:
              0.05928052 = queryWeight, product of:
                1.0821116 = boost
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.022411454 = queryNorm
              0.15277417 = fieldWeight in 3580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.0099058375 = weight(abstract_txt:with in 3580) [ClassicSimilarity], result of:
            0.0099058375 = score(doc=3580,freq=1.0), product of:
              0.06293103 = queryWeight, product of:
                1.1149322 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.022411454 = queryNorm
              0.15740784 = fieldWeight in 3580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.05165252 = weight(abstract_txt:three in 3580) [ClassicSimilarity], result of:
            0.05165252 = score(doc=3580,freq=2.0), product of:
              0.13120729 = queryWeight, product of:
                1.3144661 = boost
                4.4538803 = idf(docFreq=1346, maxDocs=42596)
                0.022411454 = queryNorm
              0.39367113 = fieldWeight in 3580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4538803 = idf(docFreq=1346, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.07042747 = weight(abstract_txt:significantly in 3580) [ClassicSimilarity], result of:
            0.07042747 = score(doc=3580,freq=1.0), product of:
              0.20326768 = queryWeight, product of:
                1.6360801 = boost
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.022411454 = queryNorm
              0.34647647 = fieldWeight in 3580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.12397919 = weight(abstract_txt:performance in 3580) [ClassicSimilarity], result of:
            0.12397919 = score(doc=3580,freq=4.0), product of:
              0.21370593 = queryWeight, product of:
                2.054586 = boost
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.022411454 = queryNorm
              0.5801392 = fieldWeight in 3580, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.05892011 = weight(abstract_txt:different in 3580) [ClassicSimilarity], result of:
            0.05892011 = score(doc=3580,freq=2.0), product of:
              0.18047456 = queryWeight, product of:
                2.1801853 = boost
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.022411454 = queryNorm
              0.3264732 = fieldWeight in 3580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6936228 = idf(docFreq=2880, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
          0.34620145 = weight(abstract_txt:filtering in 3580) [ClassicSimilarity], result of:
            0.34620145 = score(doc=3580,freq=4.0), product of:
              0.42377627 = queryWeight, product of:
                2.8932393 = boost
                6.5355515 = idf(docFreq=167, maxDocs=42596)
                0.022411454 = queryNorm
              0.81694394 = fieldWeight in 3580, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5355515 = idf(docFreq=167, maxDocs=42596)
                0.0625 = fieldNorm(doc=3580)
        0.36 = coord(9/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.25
    0.24767551 = sum of:
      0.24767551 = product of:
        0.6879875 = sum of:
          0.096814595 = weight(abstract_txt:improves in 2596) [ClassicSimilarity], result of:
            0.096814595 = score(doc=2596,freq=1.0), product of:
              0.15221706 = queryWeight, product of:
                1.0011221 = boost
                6.7843184 = idf(docFreq=130, maxDocs=42596)
                0.022411454 = queryNorm
              0.63602984 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7843184 = idf(docFreq=130, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.026746716 = weight(abstract_txt:results in 2596) [ClassicSimilarity], result of:
            0.026746716 = score(doc=2596,freq=1.0), product of:
              0.081350416 = queryWeight, product of:
                1.0350237 = boost
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.022411454 = queryNorm
              0.32878402 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5070295 = idf(docFreq=3471, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.022152223 = weight(abstract_txt:that in 2596) [ClassicSimilarity], result of:
            0.022152223 = score(doc=2596,freq=3.0), product of:
              0.05694396 = queryWeight, product of:
                1.0605713 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.022411454 = queryNorm
              0.38901794 = fieldWeight in 2596, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.013584798 = weight(abstract_txt:this in 2596) [ClassicSimilarity], result of:
            0.013584798 = score(doc=2596,freq=1.0), product of:
              0.05928052 = queryWeight, product of:
                1.0821116 = boost
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.022411454 = queryNorm
              0.22916126 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4443867 = idf(docFreq=10047, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.014858756 = weight(abstract_txt:with in 2596) [ClassicSimilarity], result of:
            0.014858756 = score(doc=2596,freq=1.0), product of:
              0.06293103 = queryWeight, product of:
                1.1149322 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.022411454 = queryNorm
              0.23611176 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.058218546 = weight(abstract_txt:text in 2596) [ClassicSimilarity], result of:
            0.058218546 = score(doc=2596,freq=2.0), product of:
              0.108445205 = queryWeight, product of:
                1.1950212 = boost
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.022411454 = queryNorm
              0.5368476 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049158 = idf(docFreq=2018, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.1056412 = weight(abstract_txt:significantly in 2596) [ClassicSimilarity], result of:
            0.1056412 = score(doc=2596,freq=1.0), product of:
              0.20326768 = queryWeight, product of:
                1.6360801 = boost
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.022411454 = queryNorm
              0.5197147 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5436234 = idf(docFreq=452, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.25698632 = weight(abstract_txt:categorization in 2596) [ClassicSimilarity], result of:
            0.25698632 = score(doc=2596,freq=2.0), product of:
              0.29181623 = queryWeight, product of:
                1.9603117 = boost
                6.6422358 = idf(docFreq=150, maxDocs=42596)
                0.022411454 = queryNorm
              0.8806443 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6422358 = idf(docFreq=150, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
          0.09298439 = weight(abstract_txt:performance in 2596) [ClassicSimilarity], result of:
            0.09298439 = score(doc=2596,freq=1.0), product of:
              0.21370593 = queryWeight, product of:
                2.054586 = boost
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.022411454 = queryNorm
              0.43510443 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6411138 = idf(docFreq=1116, maxDocs=42596)
                0.09375 = fieldNorm(doc=2596)
        0.36 = coord(9/25)
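
Each content-similarity score above is the sum of the per-term weights multiplied by a coordination factor, coord(matched/total), which penalizes documents that match only some of the 25 query terms. The following is a minimal sketch that recombines the per-term scores listed for the first entry (doc=4387). The helper name `combine_with_coord` is hypothetical, and the result is computed in double precision, so it may differ from the listed value in the last digits.

```python
def combine_with_coord(term_scores, total_query_terms):
    # Sum the matching term scores, then scale by the share of query terms matched.
    coord = len(term_scores) / total_query_terms
    return sum(term_scores.values()) * coord

# Per-term scores copied from the tree for doc=4387 (abstract_txt field).
scores_4387 = {
    "results": 0.026746716,
    "that": 0.018087214,
    "this": 0.013584798,
    "with": 0.021013454,
    "text": 0.04116673,
    "support": 0.053322203,
    "significantly": 0.1056412,
    "vector": 0.17123778,
    "categorization": 0.18171678,
    "performance": 0.09298439,
    "machines": 0.22346035,
}

total = combine_with_coord(scores_4387, total_query_terms=25)  # coord(11/25) = 0.44
print(total)  # compare with the listed 0.4175431
```

Most of the total comes from rare, topical terms such as "machines", "categorization" and "vector"; common words like "that", "this" and "with" contribute only a few hundredths each.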