Document (#28389)

Author
Sebastiani, F.
Title
¬A tutorial on automated text categorisation
Source
http://net.pku.edu.cn/~webg/papers/sebastiani99tutorial.pdf
Year
1999
Abstract
The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based on machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation will be touched upon.
Content
In: Proceedings of THAI-99, European Symposium on Telematics, Hypermedia and Artificial Intelligence
Theme
Automatic classification
Computational linguistics

Similar documents (author)

  1. Sebastiani, F.: On the role of logic in information retrieval (1998) 6.00
    6.0014763 = sum of:
      6.0014763 = weight(author_txt:sebastiani in 2138) [ClassicSimilarity], result of:
        6.0014763 = fieldWeight in 2138, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.602362 = idf(docFreq=7, maxDocs=43556)
          0.625 = fieldNorm(doc=2138)
    
  2. Sebastiani, F.: Machine learning in automated text categorization (2002) 6.00
    6.0014763 = sum of:
      6.0014763 = weight(author_txt:sebastiani in 4387) [ClassicSimilarity], result of:
        6.0014763 = fieldWeight in 4387, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.602362 = idf(docFreq=7, maxDocs=43556)
          0.625 = fieldNorm(doc=4387)
    
  3. Sebastiani, F.: Classification of text, automatic (2006) 6.00
    6.0014763 = sum of:
      6.0014763 = weight(author_txt:sebastiani in 1) [ClassicSimilarity], result of:
        6.0014763 = fieldWeight in 1, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.602362 = idf(docFreq=7, maxDocs=43556)
          0.625 = fieldNorm(doc=1)
    
  4. Debole, F.; Sebastiani, F.: ¬An analysis of the relative hardness of Reuters-21578 subsets (2005) 4.80
    4.801181 = sum of:
      4.801181 = weight(author_txt:sebastiani in 4454) [ClassicSimilarity], result of:
        4.801181 = fieldWeight in 4454, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.602362 = idf(docFreq=7, maxDocs=43556)
          0.5 = fieldNorm(doc=4454)
    
  5. Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 4.80
    4.801181 = sum of:
      4.801181 = weight(author_txt:sebastiani in 170) [ClassicSimilarity], result of:
        4.801181 = fieldWeight in 170, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.602362 = idf(docFreq=7, maxDocs=43556)
          0.5 = fieldNorm(doc=170)
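
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are an assumption chosen because they reproduce the displayed values, and the function names are hypothetical. The fieldNorm values (0.625, 0.5) are taken as given from the listing.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces idf(docFreq=7, maxDocs=43556) = 9.602362.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed tf formula: square root of the raw term frequency.
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as shown in each explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Entries 1-3: single author, fieldNorm = 0.625
single_author = field_weight(1.0, 7, 43556, 0.625)
# Entries 4-5: two authors, fieldNorm = 0.5
two_authors = field_weight(1.0, 7, 43556, 0.5)

print(round(single_author, 4), round(two_authors, 4))
```

Under these assumptions the only difference between the 6.00 and 4.80 entries is the fieldNorm, which is lower for records with more than one author.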
    

Similar documents (content)

  1. Sebastiani, F.: Machine learning in automated text categorization (2002) 0.94
    0.93990016 = sum of:
      0.93990016 = product of:
        1.4685941 = sum of:
          0.030312607 = weight(abstract_txt:approach in 4387) [ClassicSimilarity], result of:
            0.030312607 = score(doc=4387,freq=3.0), product of:
              0.05963105 = queryWeight, product of:
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015873484 = queryNorm
              0.50833595 = fieldWeight in 4387, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.07683045 = weight(abstract_txt:inductive in 4387) [ClassicSimilarity], result of:
            0.07683045 = score(doc=4387,freq=1.0), product of:
              0.12689453 = queryWeight, product of:
                1.0315024 = boost
                7.749977 = idf(docFreq=50, maxDocs=43556)
                0.015873484 = queryNorm
              0.60546696 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.749977 = idf(docFreq=50, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.024560543 = weight(abstract_txt:within in 4387) [ClassicSimilarity], result of:
            0.024560543 = score(doc=4387,freq=1.0), product of:
              0.074746236 = queryWeight, product of:
                1.1195885 = boost
                4.205897 = idf(docFreq=1764, maxDocs=43556)
                0.015873484 = queryNorm
              0.32858568 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.205897 = idf(docFreq=1764, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.104647025 = weight(abstract_txt:savings in 4387) [ClassicSimilarity], result of:
            0.104647025 = score(doc=4387,freq=1.0), product of:
              0.15592124 = queryWeight, product of:
                1.1434085 = boost
                8.59076 = idf(docFreq=21, maxDocs=43556)
                0.015873484 = queryNorm
              0.6711531 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.59076 = idf(docFreq=21, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.026991498 = weight(abstract_txt:general in 4387) [ClassicSimilarity], result of:
            0.026991498 = score(doc=4387,freq=1.0), product of:
              0.079600416 = queryWeight, product of:
                1.155371 = boost
                4.3403187 = idf(docFreq=1542, maxDocs=43556)
                0.015873484 = queryNorm
              0.3390874 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3403187 = idf(docFreq=1542, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.11435474 = weight(abstract_txt:witnessed in 4387) [ClassicSimilarity], result of:
            0.11435474 = score(doc=4387,freq=1.0), product of:
              0.1654208 = queryWeight, product of:
                1.1777248 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.015873484 = queryNorm
              0.691296 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.13207622 = weight(abstract_txt:booming in 4387) [ClassicSimilarity], result of:
            0.13207622 = score(doc=4387,freq=1.0), product of:
              0.1820974 = queryWeight, product of:
                1.2356647 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.015873484 = queryNorm
              0.7253053 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.065305546 = weight(abstract_txt:categories in 4387) [ClassicSimilarity], result of:
            0.065305546 = score(doc=4387,freq=2.0), product of:
              0.11386425 = queryWeight, product of:
                1.381839 = boost
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.015873484 = queryNorm
              0.5735386 = fieldWeight in 4387, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.06970306 = weight(abstract_txt:machine in 4387) [ClassicSimilarity], result of:
            0.06970306 = score(doc=4387,freq=2.0), product of:
              0.118920095 = queryWeight, product of:
                1.4121844 = boost
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.015873484 = queryNorm
              0.58613354 = fieldWeight in 4387, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.062312663 = weight(abstract_txt:expert in 4387) [ClassicSimilarity], result of:
            0.062312663 = score(doc=4387,freq=1.0), product of:
              0.1390427 = queryWeight, product of:
                1.5269959 = boost
                5.736382 = idf(docFreq=381, maxDocs=43556)
                0.015873484 = queryNorm
              0.44815484 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.736382 = idf(docFreq=381, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.032879557 = weight(abstract_txt:text in 4387) [ClassicSimilarity], result of:
            0.032879557 = score(doc=4387,freq=1.0), product of:
              0.10393099 = queryWeight, product of:
                1.6168954 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.015873484 = queryNorm
              0.31635952 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.048997663 = weight(abstract_txt:documents in 4387) [ClassicSimilarity], result of:
            0.048997663 = score(doc=4387,freq=2.0), product of:
              0.10762207 = queryWeight, product of:
                1.6453568 = boost
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.015873484 = queryNorm
              0.45527524 = fieldWeight in 4387, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.09397003 = weight(abstract_txt:learning in 4387) [ClassicSimilarity], result of:
            0.09397003 = score(doc=4387,freq=3.0), product of:
              0.14512657 = queryWeight, product of:
                1.9106576 = boost
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.015873484 = queryNorm
              0.647504 = fieldWeight in 4387, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.08791084 = weight(abstract_txt:automated in 4387) [ClassicSimilarity], result of:
            0.08791084 = score(doc=4387,freq=1.0), product of:
              0.20021166 = queryWeight, product of:
                2.2441614 = boost
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.015873484 = queryNorm
              0.4390895 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.118317775 = weight(abstract_txt:paradigm in 4387) [ClassicSimilarity], result of:
            0.118317775 = score(doc=4387,freq=1.0), product of:
              0.24405877 = queryWeight, product of:
                2.477745 = boost
                6.2053394 = idf(docFreq=238, maxDocs=43556)
                0.015873484 = queryNorm
              0.48479214 = fieldWeight in 4387, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2053394 = idf(docFreq=238, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
          0.37942392 = weight(abstract_txt:classifier in 4387) [ClassicSimilarity], result of:
            0.37942392 = score(doc=4387,freq=4.0), product of:
              0.33434197 = queryWeight, product of:
                2.9000459 = boost
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.015873484 = queryNorm
              1.1348379 = fieldWeight in 4387, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.078125 = fieldNorm(doc=4387)
        0.64 = coord(16/25)
    
  2. Sebastiani, F.: Classification of text, automatic (2006) 0.23
    0.22778225 = sum of:
      0.22778225 = product of:
        0.7118195 = sum of:
          0.02100119 = weight(abstract_txt:approach in 1) [ClassicSimilarity], result of:
            0.02100119 = score(doc=1,freq=1.0), product of:
              0.05963105 = queryWeight, product of:
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015873484 = queryNorm
              0.3521855 = fieldWeight in 1, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.07836666 = weight(abstract_txt:categories in 1) [ClassicSimilarity], result of:
            0.07836666 = score(doc=1,freq=2.0), product of:
              0.11386425 = queryWeight, product of:
                1.381839 = boost
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.015873484 = queryNorm
              0.68824637 = fieldWeight in 1, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.05555984 = weight(abstract_txt:automatic in 1) [ClassicSimilarity], result of:
            0.05555984 = score(doc=1,freq=1.0), product of:
              0.1140645 = queryWeight, product of:
                1.3830537 = boost
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.015873484 = queryNorm
              0.48709142 = fieldWeight in 1, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.05914501 = weight(abstract_txt:machine in 1) [ClassicSimilarity], result of:
            0.05914501 = score(doc=1,freq=1.0), product of:
              0.118920095 = queryWeight, product of:
                1.4121844 = boost
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.015873484 = queryNorm
              0.49735084 = fieldWeight in 1, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.055798456 = weight(abstract_txt:text in 1) [ClassicSimilarity], result of:
            0.055798456 = score(doc=1,freq=2.0), product of:
              0.10393099 = queryWeight, product of:
                1.6168954 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.015873484 = queryNorm
              0.5368799 = fieldWeight in 1, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.06510435 = weight(abstract_txt:learning in 1) [ClassicSimilarity], result of:
            0.06510435 = score(doc=1,freq=1.0), product of:
              0.14512657 = queryWeight, product of:
                1.9106576 = boost
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.015873484 = queryNorm
              0.44860393 = fieldWeight in 1, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.14918965 = weight(abstract_txt:automated in 1) [ClassicSimilarity], result of:
            0.14918965 = score(doc=1,freq=2.0), product of:
              0.20021166 = queryWeight, product of:
                2.2441614 = boost
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.015873484 = queryNorm
              0.7451596 = fieldWeight in 1, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
          0.22765435 = weight(abstract_txt:classifier in 1) [ClassicSimilarity], result of:
            0.22765435 = score(doc=1,freq=1.0), product of:
              0.33434197 = queryWeight, product of:
                2.9000459 = boost
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.015873484 = queryNorm
              0.6809027 = fieldWeight in 1, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.09375 = fieldNorm(doc=1)
        0.32 = coord(8/25)
    
  3. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.22
    0.22268741 = sum of:
      0.22268741 = product of:
        0.6958982 = sum of:
          0.061464358 = weight(abstract_txt:inductive in 4450) [ClassicSimilarity], result of:
            0.061464358 = score(doc=4450,freq=1.0), product of:
              0.12689453 = queryWeight, product of:
                1.0315024 = boost
                7.749977 = idf(docFreq=50, maxDocs=43556)
                0.015873484 = queryNorm
              0.48437357 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.749977 = idf(docFreq=50, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.021593198 = weight(abstract_txt:general in 4450) [ClassicSimilarity], result of:
            0.021593198 = score(doc=4450,freq=1.0), product of:
              0.079600416 = queryWeight, product of:
                1.155371 = boost
                4.3403187 = idf(docFreq=1542, maxDocs=43556)
                0.015873484 = queryNorm
              0.27126992 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3403187 = idf(docFreq=1542, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.055762447 = weight(abstract_txt:machine in 4450) [ClassicSimilarity], result of:
            0.055762447 = score(doc=4450,freq=2.0), product of:
              0.118920095 = queryWeight, product of:
                1.4121844 = boost
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.015873484 = queryNorm
              0.46890685 = fieldWeight in 4450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.0695929 = weight(abstract_txt:text in 4450) [ClassicSimilarity], result of:
            0.0695929 = score(doc=4450,freq=7.0), product of:
              0.10393099 = queryWeight, product of:
                1.6168954 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.015873484 = queryNorm
              0.66960686 = fieldWeight in 4450, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.055434532 = weight(abstract_txt:documents in 4450) [ClassicSimilarity], result of:
            0.055434532 = score(doc=4450,freq=4.0), product of:
              0.10762207 = queryWeight, product of:
                1.6453568 = boost
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.015873484 = queryNorm
              0.51508516 = fieldWeight in 4450, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.12276193 = weight(abstract_txt:learning in 4450) [ClassicSimilarity], result of:
            0.12276193 = score(doc=4450,freq=8.0), product of:
              0.14512657 = queryWeight, product of:
                1.9106576 = boost
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.015873484 = queryNorm
              0.84589565 = fieldWeight in 4450, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.09465422 = weight(abstract_txt:paradigm in 4450) [ClassicSimilarity], result of:
            0.09465422 = score(doc=4450,freq=1.0), product of:
              0.24405877 = queryWeight, product of:
                2.477745 = boost
                6.2053394 = idf(docFreq=238, maxDocs=43556)
                0.015873484 = queryNorm
              0.3878337 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2053394 = idf(docFreq=238, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
          0.21463458 = weight(abstract_txt:classifier in 4450) [ClassicSimilarity], result of:
            0.21463458 = score(doc=4450,freq=2.0), product of:
              0.33434197 = queryWeight, product of:
                2.9000459 = boost
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.015873484 = queryNorm
              0.6419612 = fieldWeight in 4450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.0625 = fieldNorm(doc=4450)
        0.32 = coord(8/25)
    
  4. Li, T.; Zhu, S.; Ogihara, M.: Hierarchical document classification using automatically generated hierarchy (2007) 0.17
    0.1742699 = sum of:
      0.1742699 = product of:
        0.54459345 = sum of:
          0.017500991 = weight(abstract_txt:approach in 1795) [ClassicSimilarity], result of:
            0.017500991 = score(doc=1795,freq=1.0), product of:
              0.05963105 = queryWeight, product of:
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.015873484 = queryNorm
              0.2934879 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7566452 = idf(docFreq=2765, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.11435474 = weight(abstract_txt:witnessed in 1795) [ClassicSimilarity], result of:
            0.11435474 = score(doc=1795,freq=1.0), product of:
              0.1654208 = queryWeight, product of:
                1.1777248 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.015873484 = queryNorm
              0.691296 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.13207622 = weight(abstract_txt:booming in 1795) [ClassicSimilarity], result of:
            0.13207622 = score(doc=1795,freq=1.0), product of:
              0.1820974 = queryWeight, product of:
                1.2356647 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.015873484 = queryNorm
              0.7253053 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.065305546 = weight(abstract_txt:categories in 1795) [ClassicSimilarity], result of:
            0.065305546 = score(doc=1795,freq=2.0), product of:
              0.11386425 = queryWeight, product of:
                1.381839 = boost
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.015873484 = queryNorm
              0.5735386 = fieldWeight in 1795, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.046299867 = weight(abstract_txt:automatic in 1795) [ClassicSimilarity], result of:
            0.046299867 = score(doc=1795,freq=1.0), product of:
              0.1140645 = queryWeight, product of:
                1.3830537 = boost
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.015873484 = queryNorm
              0.40590954 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.046498716 = weight(abstract_txt:text in 1795) [ClassicSimilarity], result of:
            0.046498716 = score(doc=1795,freq=2.0), product of:
              0.10393099 = queryWeight, product of:
                1.6168954 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.015873484 = queryNorm
              0.4473999 = fieldWeight in 1795, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.03464658 = weight(abstract_txt:documents in 1795) [ClassicSimilarity], result of:
            0.03464658 = score(doc=1795,freq=1.0), product of:
              0.10762207 = queryWeight, product of:
                1.6453568 = boost
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.015873484 = queryNorm
              0.32192823 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1206813 = idf(docFreq=1921, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
          0.08791084 = weight(abstract_txt:automated in 1795) [ClassicSimilarity], result of:
            0.08791084 = score(doc=1795,freq=1.0), product of:
              0.20021166 = queryWeight, product of:
                2.2441614 = boost
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.015873484 = queryNorm
              0.4390895 = fieldWeight in 1795, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6203456 = idf(docFreq=428, maxDocs=43556)
                0.078125 = fieldNorm(doc=1795)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.14
    0.13683392 = sum of:
      0.13683392 = product of:
        0.5701414 = sum of:
          0.055413593 = weight(abstract_txt:categories in 2593) [ClassicSimilarity], result of:
            0.055413593 = score(doc=2593,freq=1.0), product of:
              0.11386425 = queryWeight, product of:
                1.381839 = boost
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.015873484 = queryNorm
              0.48666367 = fieldWeight in 2593, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.191079 = idf(docFreq=658, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
          0.05555984 = weight(abstract_txt:automatic in 2593) [ClassicSimilarity], result of:
            0.05555984 = score(doc=2593,freq=1.0), product of:
              0.1140645 = queryWeight, product of:
                1.3830537 = boost
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.015873484 = queryNorm
              0.48709142 = fieldWeight in 2593, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
          0.08364367 = weight(abstract_txt:machine in 2593) [ClassicSimilarity], result of:
            0.08364367 = score(doc=2593,freq=2.0), product of:
              0.118920095 = queryWeight, product of:
                1.4121844 = boost
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.015873484 = queryNorm
              0.70336026 = fieldWeight in 2593, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3050756 = idf(docFreq=587, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
          0.055798456 = weight(abstract_txt:text in 2593) [ClassicSimilarity], result of:
            0.055798456 = score(doc=2593,freq=2.0), product of:
              0.10393099 = queryWeight, product of:
                1.6168954 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.015873484 = queryNorm
              0.5368799 = fieldWeight in 2593, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
          0.092071444 = weight(abstract_txt:learning in 2593) [ClassicSimilarity], result of:
            0.092071444 = score(doc=2593,freq=2.0), product of:
              0.14512657 = queryWeight, product of:
                1.9106576 = boost
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.015873484 = queryNorm
              0.6344217 = fieldWeight in 2593, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7851086 = idf(docFreq=988, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
          0.22765435 = weight(abstract_txt:classifier in 2593) [ClassicSimilarity], result of:
            0.22765435 = score(doc=2593,freq=1.0), product of:
              0.33434197 = queryWeight, product of:
                2.9000459 = boost
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.015873484 = queryNorm
              0.6809027 = fieldWeight in 2593, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2629623 = idf(docFreq=82, maxDocs=43556)
                0.09375 = fieldNorm(doc=2593)
        0.24 = coord(6/25)
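
The content-match scores combine a per-term score with a coordination factor. The sketch below recomputes two term scores and the final score of the first entry (doc 4387), assuming the same tf and idf formulas as the listing implies (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical; boost, queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # score = queryWeight * fieldWeight
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm             # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

QUERY_NORM = 0.015873484  # taken from the listing
MAX_DOCS = 43556

# "classifier" in doc 4387: freq=4, docFreq=82, boost=2.9000459
classifier_score = term_score(4.0, 82, MAX_DOCS, 0.078125, QUERY_NORM, boost=2.9000459)

# "approach" in doc 4387: freq=3, docFreq=2765, no boost line in the listing
approach_score = term_score(3.0, 2765, MAX_DOCS, 0.078125, QUERY_NORM)

# Final score: sum of the term scores times coord(matched terms / query terms)
term_sum = 1.4685941  # sum of the 16 term scores, taken from the listing
final_score = term_sum * (16 / 25)

print(round(classifier_score, 4), round(approach_score, 4), round(final_score, 4))
```

The coord factor explains why the remaining entries score much lower: they match only 6 to 8 of the 25 query terms, so their term sums are scaled by 0.24 or 0.32 instead of 0.64.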