Document (#28392)

Author
Sebastiani, F.
Title
¬A tutorial on automated text categorisation
Source
http://net.pku.edu.cn/~webg/papers/sebastiani99tutorial.pdf
Year
1999
Abstract
The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based on machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation, will be touched upon.
Content
In: Proceedings of THAI-99, European Symposium on Telematics, Hypermedia and Artificial Intelligence
Theme
Automatic classification
Computational linguistics

Similar documents (author)

  1. Sebastiani, F.: On the role of logic in information retrieval (1998) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:sebastiani in 3141) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 3141, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=3141)
    
  2. Sebastiani, F.: Machine learning in automated text categorization (2002) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:sebastiani in 5390) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 5390, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=5390)
    
  3. Sebastiani, F.: Classification of text, automatic (2006) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:sebastiani in 4) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 4, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=4)
    
  4. Debole, F.; Sebastiani, F.: ¬An analysis of the relative hardness of Reuters-21578 subsets (2005) 4.80
    4.797702 = sum of:
      4.797702 = weight(author_txt:sebastiani in 5457) [ClassicSimilarity], result of:
        4.797702 = fieldWeight in 5457, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.5 = fieldNorm(doc=5457)
    
  5. Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 4.80
    4.797702 = sum of:
      4.797702 = weight(author_txt:sebastiani in 173) [ClassicSimilarity], result of:
        4.797702 = fieldWeight in 173, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.5 = fieldNorm(doc=173)
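
The author-match weights above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and are not part of the listing. The fieldNorm values are taken from the listing as given rather than recomputed.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entries 1-3: single-author records, fieldNorm = 0.625
single_author = field_weight(1.0, 7, 43254, 0.625)
# Entries 4-5: two-author records, fieldNorm = 0.5
two_authors = field_weight(1.0, 7, 43254, 0.5)

# These should land close to the 5.9971275 and 4.797702 shown above;
# small differences come from Lucene's single-precision arithmetic.
print(round(single_author, 4), round(two_authors, 4))
```

The only factor that differs between the two groups is fieldNorm, which is smaller for the longer author field, so the co-authored records rank lower.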
    

Similar documents (content)

  1. Sebastiani, F.: Machine learning in automated text categorization (2002) 0.94
    0.9399716 = sum of:
      0.9399716 = product of:
        1.4687057 = sum of:
          0.030266957 = weight(abstract_txt:approach in 5390) [ClassicSimilarity], result of:
            0.030266957 = score(doc=5390,freq=3.0), product of:
              0.059553817 = queryWeight, product of:
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015856272 = queryNorm
              0.50822866 = fieldWeight in 5390, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.07714566 = weight(abstract_txt:inductive in 5390) [ClassicSimilarity], result of:
            0.07714566 = score(doc=5390,freq=1.0), product of:
              0.12720431 = queryWeight, product of:
                1.0334301 = boost
                7.762822 = idf(docFreq=49, maxDocs=43254)
                0.015856272 = queryNorm
              0.60647047 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.762822 = idf(docFreq=49, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.024586733 = weight(abstract_txt:within in 5390) [ClassicSimilarity], result of:
            0.024586733 = score(doc=5390,freq=1.0), product of:
              0.07477757 = queryWeight, product of:
                1.1205491 = boost
                4.208617 = idf(docFreq=1747, maxDocs=43254)
                0.015856272 = queryNorm
              0.32879823 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.208617 = idf(docFreq=1747, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.10430177 = weight(abstract_txt:savings in 5390) [ClassicSimilarity], result of:
            0.10430177 = score(doc=5390,freq=1.0), product of:
              0.15553279 = queryWeight, product of:
                1.1427236 = boost
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.015856272 = queryNorm
              0.67060953 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.026995951 = weight(abstract_txt:general in 5390) [ClassicSimilarity], result of:
            0.026995951 = score(doc=5390,freq=1.0), product of:
              0.079585984 = queryWeight, product of:
                1.156015 = boost
                4.341822 = idf(docFreq=1529, maxDocs=43254)
                0.015856272 = queryNorm
              0.33920485 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.341822 = idf(docFreq=1529, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.113985576 = weight(abstract_txt:witnessed in 5390) [ClassicSimilarity], result of:
            0.113985576 = score(doc=5390,freq=1.0), product of:
              0.16501652 = queryWeight, product of:
                1.1770473 = boost
                8.841632 = idf(docFreq=16, maxDocs=43254)
                0.015856272 = queryNorm
              0.6907525 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.841632 = idf(docFreq=16, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.13166437 = weight(abstract_txt:booming in 5390) [ClassicSimilarity], result of:
            0.13166437 = score(doc=5390,freq=1.0), product of:
              0.18166572 = queryWeight, product of:
                1.2349993 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.015856272 = queryNorm
              0.7247617 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.065621585 = weight(abstract_txt:categories in 5390) [ClassicSimilarity], result of:
            0.065621585 = score(doc=5390,freq=2.0), product of:
              0.11419804 = queryWeight, product of:
                1.38476 = boost
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.015856272 = queryNorm
              0.5746297 = fieldWeight in 5390, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.0702504 = weight(abstract_txt:machine in 5390) [ClassicSimilarity], result of:
            0.0702504 = score(doc=5390,freq=2.0), product of:
              0.119507015 = queryWeight, product of:
                1.4165826 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.015856272 = queryNorm
              0.58783495 = fieldWeight in 5390, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.062117152 = weight(abstract_txt:expert in 5390) [ClassicSimilarity], result of:
            0.062117152 = score(doc=5390,freq=1.0), product of:
              0.1387113 = queryWeight, product of:
                1.5261637 = boost
                5.7320457 = idf(docFreq=380, maxDocs=43254)
                0.015856272 = queryNorm
              0.44781607 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7320457 = idf(docFreq=380, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.03285902 = weight(abstract_txt:text in 5390) [ClassicSimilarity], result of:
            0.03285902 = score(doc=5390,freq=1.0), product of:
              0.10385745 = queryWeight, product of:
                1.617371 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.015856272 = queryNorm
              0.31638578 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.04879989 = weight(abstract_txt:documents in 5390) [ClassicSimilarity], result of:
            0.04879989 = score(doc=5390,freq=2.0), product of:
              0.10730101 = queryWeight, product of:
                1.6439656 = boost
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.015856272 = queryNorm
              0.45479432 = fieldWeight in 5390, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.09450032 = weight(abstract_txt:learning in 5390) [ClassicSimilarity], result of:
            0.09450032 = score(doc=5390,freq=3.0), product of:
              0.14562961 = queryWeight, product of:
                1.9152068 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.015856272 = queryNorm
              0.6489087 = fieldWeight in 5390, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.08783686 = weight(abstract_txt:automated in 5390) [ClassicSimilarity], result of:
            0.08783686 = score(doc=5390,freq=1.0), product of:
              0.20004106 = queryWeight, product of:
                2.2446592 = boost
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.015856272 = queryNorm
              0.4390942 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.11976928 = weight(abstract_txt:paradigm in 5390) [ClassicSimilarity], result of:
            0.11976928 = score(doc=5390,freq=1.0), product of:
              0.2459791 = queryWeight, product of:
                2.4890862 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.015856272 = queryNorm
              0.48690838 = fieldWeight in 5390, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
          0.37800404 = weight(abstract_txt:classifier in 5390) [ClassicSimilarity], result of:
            0.37800404 = score(doc=5390,freq=4.0), product of:
              0.3334102 = queryWeight, product of:
                2.8978791 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.015856272 = queryNorm
              1.1337507 = fieldWeight in 5390, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.078125 = fieldNorm(doc=5390)
        0.64 = coord(16/25)
    
  2. Sebastiani, F.: Classification of text, automatic (2006) 0.23
    0.2278273 = sum of:
      0.2278273 = product of:
        0.7119603 = sum of:
          0.020969564 = weight(abstract_txt:approach in 4) [ClassicSimilarity], result of:
            0.020969564 = score(doc=4,freq=1.0), product of:
              0.059553817 = queryWeight, product of:
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015856272 = queryNorm
              0.35211116 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.055533543 = weight(abstract_txt:automatic in 4) [ClassicSimilarity], result of:
            0.055533543 = score(doc=4,freq=1.0), product of:
              0.11399529 = queryWeight, product of:
                1.3835303 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.015856272 = queryNorm
              0.48715645 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.0787459 = weight(abstract_txt:categories in 4) [ClassicSimilarity], result of:
            0.0787459 = score(doc=4,freq=2.0), product of:
              0.11419804 = queryWeight, product of:
                1.38476 = boost
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.015856272 = queryNorm
              0.68955564 = fieldWeight in 4, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.059609447 = weight(abstract_txt:machine in 4) [ClassicSimilarity], result of:
            0.059609447 = score(doc=4,freq=1.0), product of:
              0.119507015 = queryWeight, product of:
                1.4165826 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.015856272 = queryNorm
              0.49879456 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.055763606 = weight(abstract_txt:text in 4) [ClassicSimilarity], result of:
            0.055763606 = score(doc=4,freq=2.0), product of:
              0.10385745 = queryWeight, product of:
                1.617371 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.015856272 = queryNorm
              0.5369245 = fieldWeight in 4, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.065471746 = weight(abstract_txt:learning in 4) [ClassicSimilarity], result of:
            0.065471746 = score(doc=4,freq=1.0), product of:
              0.14562961 = queryWeight, product of:
                1.9152068 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.015856272 = queryNorm
              0.44957712 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.14906411 = weight(abstract_txt:automated in 4) [ClassicSimilarity], result of:
            0.14906411 = score(doc=4,freq=2.0), product of:
              0.20004106 = queryWeight, product of:
                2.2446592 = boost
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.015856272 = queryNorm
              0.74516755 = fieldWeight in 4, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
          0.22680242 = weight(abstract_txt:classifier in 4) [ClassicSimilarity], result of:
            0.22680242 = score(doc=4,freq=1.0), product of:
              0.3334102 = queryWeight, product of:
                2.8978791 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.015856272 = queryNorm
              0.6802504 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.09375 = fieldNorm(doc=4)
        0.32 = coord(8/25)
    
  3. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.22
    0.22316009 = sum of:
      0.22316009 = product of:
        0.6973753 = sum of:
          0.06171653 = weight(abstract_txt:inductive in 4453) [ClassicSimilarity], result of:
            0.06171653 = score(doc=4453,freq=1.0), product of:
              0.12720431 = queryWeight, product of:
                1.0334301 = boost
                7.762822 = idf(docFreq=49, maxDocs=43254)
                0.015856272 = queryNorm
              0.48517638 = fieldWeight in 4453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.762822 = idf(docFreq=49, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.021596761 = weight(abstract_txt:general in 4453) [ClassicSimilarity], result of:
            0.021596761 = score(doc=4453,freq=1.0), product of:
              0.079585984 = queryWeight, product of:
                1.156015 = boost
                4.341822 = idf(docFreq=1529, maxDocs=43254)
                0.015856272 = queryNorm
              0.27136388 = fieldWeight in 4453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.341822 = idf(docFreq=1529, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.05620032 = weight(abstract_txt:machine in 4453) [ClassicSimilarity], result of:
            0.05620032 = score(doc=4453,freq=2.0), product of:
              0.119507015 = queryWeight, product of:
                1.4165826 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.015856272 = queryNorm
              0.47026798 = fieldWeight in 4453, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.069549434 = weight(abstract_txt:text in 4453) [ClassicSimilarity], result of:
            0.069549434 = score(doc=4453,freq=7.0), product of:
              0.10385745 = queryWeight, product of:
                1.617371 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.015856272 = queryNorm
              0.6696625 = fieldWeight in 4453, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.055210773 = weight(abstract_txt:documents in 4453) [ClassicSimilarity], result of:
            0.055210773 = score(doc=4453,freq=4.0), product of:
              0.10730101 = queryWeight, product of:
                1.6439656 = boost
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.015856272 = queryNorm
              0.51454103 = fieldWeight in 4453, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.123454705 = weight(abstract_txt:learning in 4453) [ClassicSimilarity], result of:
            0.123454705 = score(doc=4453,freq=8.0), product of:
              0.14562961 = queryWeight, product of:
                1.9152068 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.015856272 = queryNorm
              0.84773076 = fieldWeight in 4453, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.09581543 = weight(abstract_txt:paradigm in 4453) [ClassicSimilarity], result of:
            0.09581543 = score(doc=4453,freq=1.0), product of:
              0.2459791 = queryWeight, product of:
                2.4890862 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.015856272 = queryNorm
              0.3895267 = fieldWeight in 4453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
          0.21383137 = weight(abstract_txt:classifier in 4453) [ClassicSimilarity], result of:
            0.21383137 = score(doc=4453,freq=2.0), product of:
              0.3334102 = queryWeight, product of:
                2.8978791 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.015856272 = queryNorm
              0.6413462 = fieldWeight in 4453, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.0625 = fieldNorm(doc=4453)
        0.32 = coord(8/25)
    
  4. Li, T.; Zhu, S.; Ogihara, M.: Hierarchical document classification using automatically generated hierarchy (2007) 0.17
    0.17402795 = sum of:
      0.17402795 = product of:
        0.54383737 = sum of:
          0.017474636 = weight(abstract_txt:approach in 1262) [ClassicSimilarity], result of:
            0.017474636 = score(doc=1262,freq=1.0), product of:
              0.059553817 = queryWeight, product of:
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.015856272 = queryNorm
              0.29342598 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.113985576 = weight(abstract_txt:witnessed in 1262) [ClassicSimilarity], result of:
            0.113985576 = score(doc=1262,freq=1.0), product of:
              0.16501652 = queryWeight, product of:
                1.1770473 = boost
                8.841632 = idf(docFreq=16, maxDocs=43254)
                0.015856272 = queryNorm
              0.6907525 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.841632 = idf(docFreq=16, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.13166437 = weight(abstract_txt:booming in 1262) [ClassicSimilarity], result of:
            0.13166437 = score(doc=1262,freq=1.0), product of:
              0.18166572 = queryWeight, product of:
                1.2349993 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.015856272 = queryNorm
              0.7247617 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.046277948 = weight(abstract_txt:automatic in 1262) [ClassicSimilarity], result of:
            0.046277948 = score(doc=1262,freq=1.0), product of:
              0.11399529 = queryWeight, product of:
                1.3835303 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.015856272 = queryNorm
              0.4059637 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.065621585 = weight(abstract_txt:categories in 1262) [ClassicSimilarity], result of:
            0.065621585 = score(doc=1262,freq=2.0), product of:
              0.11419804 = queryWeight, product of:
                1.38476 = boost
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.015856272 = queryNorm
              0.5746297 = fieldWeight in 1262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.04646967 = weight(abstract_txt:text in 1262) [ClassicSimilarity], result of:
            0.04646967 = score(doc=1262,freq=2.0), product of:
              0.10385745 = queryWeight, product of:
                1.617371 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.015856272 = queryNorm
              0.44743705 = fieldWeight in 1262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.034506734 = weight(abstract_txt:documents in 1262) [ClassicSimilarity], result of:
            0.034506734 = score(doc=1262,freq=1.0), product of:
              0.10730101 = queryWeight, product of:
                1.6439656 = boost
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.015856272 = queryNorm
              0.32158816 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1163282 = idf(docFreq=1916, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.08783686 = weight(abstract_txt:automated in 1262) [ClassicSimilarity], result of:
            0.08783686 = score(doc=1262,freq=1.0), product of:
              0.20004106 = queryWeight, product of:
                2.2446592 = boost
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.015856272 = queryNorm
              0.4390942 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6204057 = idf(docFreq=425, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.14
    0.13696148 = sum of:
      0.13696148 = product of:
        0.5706728 = sum of:
          0.055533543 = weight(abstract_txt:automatic in 3596) [ClassicSimilarity], result of:
            0.055533543 = score(doc=3596,freq=1.0), product of:
              0.11399529 = queryWeight, product of:
                1.3835303 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.015856272 = queryNorm
              0.48715645 = fieldWeight in 3596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
          0.05568176 = weight(abstract_txt:categories in 3596) [ClassicSimilarity], result of:
            0.05568176 = score(doc=3596,freq=1.0), product of:
              0.11419804 = queryWeight, product of:
                1.38476 = boost
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.015856272 = queryNorm
              0.48758948 = fieldWeight in 3596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2009544 = idf(docFreq=647, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
          0.08430048 = weight(abstract_txt:machine in 3596) [ClassicSimilarity], result of:
            0.08430048 = score(doc=3596,freq=2.0), product of:
              0.119507015 = queryWeight, product of:
                1.4165826 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.015856272 = queryNorm
              0.70540196 = fieldWeight in 3596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
          0.055763606 = weight(abstract_txt:text in 3596) [ClassicSimilarity], result of:
            0.055763606 = score(doc=3596,freq=2.0), product of:
              0.10385745 = queryWeight, product of:
                1.617371 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.015856272 = queryNorm
              0.5369245 = fieldWeight in 3596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
          0.09259103 = weight(abstract_txt:learning in 3596) [ClassicSimilarity], result of:
            0.09259103 = score(doc=3596,freq=2.0), product of:
              0.14562961 = queryWeight, product of:
                1.9152068 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.015856272 = queryNorm
              0.6357981 = fieldWeight in 3596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
          0.22680242 = weight(abstract_txt:classifier in 3596) [ClassicSimilarity], result of:
            0.22680242 = score(doc=3596,freq=1.0), product of:
              0.3334102 = queryWeight, product of:
                2.8978791 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.015856272 = queryNorm
              0.6802504 = fieldWeight in 3596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.09375 = fieldNorm(doc=3596)
        0.24 = coord(6/25)
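
The content-similarity scores combine the same factors with a query-side weight and a coordination factor. The sketch below recomputes entry 5 (doc 3596) under the assumption that each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the sum is scaled by coord = matched terms / query terms. The freq, docFreq, boost, fieldNorm and queryNorm values are copied from the listing; the helper names are illustrative only.

```python
import math

MAX_DOCS = 43254
QUERY_NORM = 0.015856272  # queryNorm as shown in the listing
FIELD_NORM = 0.09375      # fieldNorm(doc=3596) as shown in the listing
QUERY_TERMS = 25          # denominator of coord(6/25)


def idf(doc_freq: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float, boost: float = 1.0) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# term -> (freq, docFreq, boost), copied from entry 5
terms = {
    "automatic": (1.0, 650, 1.3835303),
    "categories": (1.0, 647, 1.38476),
    "machine": (2.0, 574, 1.4165826),
    "text": (2.0, 2048, 1.617371),
    "learning": (2.0, 971, 1.9152068),
    "classifier": (1.0, 82, 2.8978791),
}

raw_sum = sum(term_score(f, df, FIELD_NORM, b) for f, df, b in terms.values())
coord = len(terms) / QUERY_TERMS
score = coord * raw_sum

# Should be close to the 0.13696148 shown for entry 5.
print(round(score, 4))
```

Because coord scales the whole sum, entry 1 (16 of 25 query terms matched) is ranked far above the others even where individual term scores are similar.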