Document (#28391)

Author
Sebastiani, F.
Title
¬A tutorial on automated text categorisation
Source
http://net.pku.edu.cn/~webg/papers/sebastiani99tutorial.pdf
Year
1999
Abstract
The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based on machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation, will be touched upon.
Content
In: Proceedings of THAI-99, European Symposium on Telematics, Hypermedia and Artificial Intelligence
Theme
Automatic classification
Computational linguistics

Similar documents (author)

  1. Sebastiani, F.: On the role of logic in information retrieval (1998) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 1140) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 1140, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=1140)
    
  2. Sebastiani, F.: Machine learning in automated text categorization (2002) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 3389) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 3389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=3389)
    
  3. Sebastiani, F.: Classification of text, automatic (2006) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 5003) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 5003, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=5003)
    
  4. Debole, F.; Sebastiani, F.: ¬An analysis of the relative hardness of Reuters-21578 subsets (2005) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:sebastiani in 3456) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 3456, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=3456)
    
  5. Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:sebastiani in 5172) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 5172, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=5172)
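
The author-match scores above can be reproduced by hand. The sketch below is a minimal illustration, not the search engine's own code: the function names are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption chosen because it matches the values shown in the listing (the exact form can differ between Lucene versions). Each field weight is the product of tf (square root of the term frequency), idf, and the field norm.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces idf(docFreq=8, maxDocs=44218) from the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # tf is the square root of the raw term frequency.
    return math.sqrt(freq)

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

idf_author = classic_idf(8, 44218)
# Entries 1-3: single-author records, fieldNorm 0.625.
w_single = field_weight(1.0, 8, 44218, 0.625)
# Entries 4-5: two-author records, fieldNorm 0.5.
w_double = field_weight(1.0, 8, 44218, 0.5)

print(round(idf_author, 4), round(w_single, 4), round(w_double, 4))
```

The only quantity that varies across the five entries is the field norm, which is smaller for the records with two authors; that alone accounts for the drop from 5.94 to 4.75.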
    

Similar documents (content)

  1. Sebastiani, F.: Machine learning in automated text categorization (2002) 0.93
    0.9339358 = sum of:
      0.9339358 = product of:
        1.4592748 = sum of:
          0.030072706 = weight(abstract_txt:approach in 3389) [ClassicSimilarity], result of:
            0.030072706 = score(doc=3389,freq=3.0), product of:
              0.05933788 = queryWeight, product of:
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015843173 = queryNorm
              0.5068045 = fieldWeight in 3389, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.07622139 = weight(abstract_txt:inductive in 3389) [ClassicSimilarity], result of:
            0.07622139 = score(doc=3389,freq=1.0), product of:
              0.12626956 = queryWeight, product of:
                1.0314978 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.015843173 = queryNorm
              0.60364026 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.024411151 = weight(abstract_txt:within in 3389) [ClassicSimilarity], result of:
            0.024411151 = score(doc=3389,freq=1.0), product of:
              0.074470274 = queryWeight, product of:
                1.120277 = boost
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.015843173 = queryNorm
              0.32779726 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.105315626 = weight(abstract_txt:savings in 3389) [ClassicSimilarity], result of:
            0.105315626 = score(doc=3389,freq=1.0), product of:
              0.15664239 = queryWeight, product of:
                1.1488773 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.015843173 = queryNorm
              0.6723316 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.026919844 = weight(abstract_txt:general in 3389) [ClassicSimilarity], result of:
            0.026919844 = score(doc=3389,freq=1.0), product of:
              0.07948877 = queryWeight, product of:
                1.1574091 = boost
                4.3348765 = idf(docFreq=1574, maxDocs=44218)
                0.015843173 = queryNorm
              0.33866224 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3348765 = idf(docFreq=1574, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.11285601 = weight(abstract_txt:witnessed in 3389) [ClassicSimilarity], result of:
            0.11285601 = score(doc=3389,freq=1.0), product of:
              0.16403274 = queryWeight, product of:
                1.1756668 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.015843173 = queryNorm
              0.688009 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.13286789 = weight(abstract_txt:booming in 3389) [ClassicSimilarity], result of:
            0.13286789 = score(doc=3389,freq=1.0), product of:
              0.18289174 = queryWeight, product of:
                1.2414123 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.015843173 = queryNorm
              0.72648376 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.06487511 = weight(abstract_txt:categories in 3389) [ClassicSimilarity], result of:
            0.06487511 = score(doc=3389,freq=2.0), product of:
              0.11340516 = queryWeight, product of:
                1.3824531 = boost
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.015843173 = queryNorm
              0.5720649 = fieldWeight in 3389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.06873762 = weight(abstract_txt:machine in 3389) [ClassicSimilarity], result of:
            0.06873762 = score(doc=3389,freq=2.0), product of:
              0.11786289 = queryWeight, product of:
                1.4093618 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.015843173 = queryNorm
              0.58319986 = fieldWeight in 3389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.061783876 = weight(abstract_txt:expert in 3389) [ClassicSimilarity], result of:
            0.061783876 = score(doc=3389,freq=1.0), product of:
              0.13830595 = queryWeight, product of:
                1.5267024 = boost
                5.7180014 = idf(docFreq=394, maxDocs=44218)
                0.015843173 = queryNorm
              0.44671887 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7180014 = idf(docFreq=394, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.032781143 = weight(abstract_txt:text in 3389) [ClassicSimilarity], result of:
            0.032781143 = score(doc=3389,freq=1.0), product of:
              0.10376174 = queryWeight, product of:
                1.6195644 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015843173 = queryNorm
              0.3159271 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.049074072 = weight(abstract_txt:documents in 3389) [ClassicSimilarity], result of:
            0.049074072 = score(doc=3389,freq=2.0), product of:
              0.10777365 = queryWeight, product of:
                1.6505774 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.015843173 = queryNorm
              0.4553439 = fieldWeight in 3389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.09206923 = weight(abstract_txt:learning in 3389) [ClassicSimilarity], result of:
            0.09206923 = score(doc=3389,freq=3.0), product of:
              0.14321564 = queryWeight, product of:
                1.9027197 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.015843173 = queryNorm
              0.6428713 = fieldWeight in 3389, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.08721059 = weight(abstract_txt:automated in 3389) [ClassicSimilarity], result of:
            0.08721059 = score(doc=3389,freq=1.0), product of:
              0.19922048 = queryWeight, product of:
                2.2441227 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.015843173 = queryNorm
              0.43775916 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.11743 = weight(abstract_txt:paradigm in 3389) [ClassicSimilarity], result of:
            0.11743 = score(doc=3389,freq=1.0), product of:
              0.24292593 = queryWeight, product of:
                2.478087 = boost
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.015843173 = queryNorm
              0.48339838 = fieldWeight in 3389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
          0.37664855 = weight(abstract_txt:classifier in 3389) [ClassicSimilarity], result of:
            0.37664855 = score(doc=3389,freq=4.0), product of:
              0.33283222 = queryWeight, product of:
                2.9006298 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015843173 = queryNorm
              1.1316469 = fieldWeight in 3389, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.078125 = fieldNorm(doc=3389)
        0.64 = coord(16/25)
    
  2. Sebastiani, F.: Classification of text, automatic (2006) 0.23
    0.22593282 = sum of:
      0.22593282 = product of:
        0.7060401 = sum of:
          0.020834984 = weight(abstract_txt:approach in 5003) [ClassicSimilarity], result of:
            0.020834984 = score(doc=5003,freq=1.0), product of:
              0.05933788 = queryWeight, product of:
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015843173 = queryNorm
              0.3511245 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.07785013 = weight(abstract_txt:categories in 5003) [ClassicSimilarity], result of:
            0.07785013 = score(doc=5003,freq=2.0), product of:
              0.11340516 = queryWeight, product of:
                1.3824531 = boost
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.015843173 = queryNorm
              0.68647784 = fieldWeight in 5003, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.05561989 = weight(abstract_txt:automatic in 5003) [ClassicSimilarity], result of:
            0.05561989 = score(doc=5003,freq=1.0), product of:
              0.11418876 = queryWeight, product of:
                1.387221 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015843173 = queryNorm
              0.48708728 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.058325805 = weight(abstract_txt:machine in 5003) [ClassicSimilarity], result of:
            0.058325805 = score(doc=5003,freq=1.0), product of:
              0.11786289 = queryWeight, product of:
                1.4093618 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.015843173 = queryNorm
              0.49486148 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.055631448 = weight(abstract_txt:text in 5003) [ClassicSimilarity], result of:
            0.055631448 = score(doc=5003,freq=2.0), product of:
              0.10376174 = queryWeight, product of:
                1.6195644 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015843173 = queryNorm
              0.53614604 = fieldWeight in 5003, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.06378744 = weight(abstract_txt:learning in 5003) [ClassicSimilarity], result of:
            0.06378744 = score(doc=5003,freq=1.0), product of:
              0.14321564 = queryWeight, product of:
                1.9027197 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.015843173 = queryNorm
              0.44539434 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.14800128 = weight(abstract_txt:automated in 5003) [ClassicSimilarity], result of:
            0.14800128 = score(doc=5003,freq=2.0), product of:
              0.19922048 = queryWeight, product of:
                2.2441227 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.015843173 = queryNorm
              0.7429019 = fieldWeight in 5003, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.22598912 = weight(abstract_txt:classifier in 5003) [ClassicSimilarity], result of:
            0.22598912 = score(doc=5003,freq=1.0), product of:
              0.33283222 = queryWeight, product of:
                2.9006298 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015843173 = queryNorm
              0.6789881 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
        0.32 = coord(8/25)
    
  3. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.22
    0.22070272 = sum of:
      0.22070272 = product of:
        0.689696 = sum of:
          0.060977116 = weight(abstract_txt:inductive in 2452) [ClassicSimilarity], result of:
            0.060977116 = score(doc=2452,freq=1.0), product of:
              0.12626956 = queryWeight, product of:
                1.0314978 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.015843173 = queryNorm
              0.4829122 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.021535875 = weight(abstract_txt:general in 2452) [ClassicSimilarity], result of:
            0.021535875 = score(doc=2452,freq=1.0), product of:
              0.07948877 = queryWeight, product of:
                1.1574091 = boost
                4.3348765 = idf(docFreq=1574, maxDocs=44218)
                0.015843173 = queryNorm
              0.27092978 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3348765 = idf(docFreq=1574, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.054990094 = weight(abstract_txt:machine in 2452) [ClassicSimilarity], result of:
            0.054990094 = score(doc=2452,freq=2.0), product of:
              0.11786289 = queryWeight, product of:
                1.4093618 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.015843173 = queryNorm
              0.4665599 = fieldWeight in 2452, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.069384605 = weight(abstract_txt:text in 2452) [ClassicSimilarity], result of:
            0.069384605 = score(doc=2452,freq=7.0), product of:
              0.10376174 = queryWeight, product of:
                1.6195644 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015843173 = queryNorm
              0.6686916 = fieldWeight in 2452, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.055520978 = weight(abstract_txt:documents in 2452) [ClassicSimilarity], result of:
            0.055520978 = score(doc=2452,freq=4.0), product of:
              0.10777365 = queryWeight, product of:
                1.6505774 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.015843173 = queryNorm
              0.5151628 = fieldWeight in 2452, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.120278746 = weight(abstract_txt:learning in 2452) [ClassicSimilarity], result of:
            0.120278746 = score(doc=2452,freq=8.0), product of:
              0.14321564 = queryWeight, product of:
                1.9027197 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.015843173 = queryNorm
              0.83984363 = fieldWeight in 2452, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.093944 = weight(abstract_txt:paradigm in 2452) [ClassicSimilarity], result of:
            0.093944 = score(doc=2452,freq=1.0), product of:
              0.24292593 = queryWeight, product of:
                2.478087 = boost
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.015843173 = queryNorm
              0.3867187 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.2130646 = weight(abstract_txt:classifier in 2452) [ClassicSimilarity], result of:
            0.2130646 = score(doc=2452,freq=2.0), product of:
              0.33283222 = queryWeight, product of:
                2.9006298 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015843173 = queryNorm
              0.64015615 = fieldWeight in 2452, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
        0.32 = coord(8/25)
    
  4. Li, T.; Zhu, S.; Ogihara, M.: Hierarchical document classification using automatically generated hierarchy (2007) 0.17
    0.17362629 = sum of:
      0.17362629 = product of:
        0.54258215 = sum of:
          0.017362485 = weight(abstract_txt:approach in 4797) [ClassicSimilarity], result of:
            0.017362485 = score(doc=4797,freq=1.0), product of:
              0.05933788 = queryWeight, product of:
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015843173 = queryNorm
              0.29260373 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.11285601 = weight(abstract_txt:witnessed in 4797) [ClassicSimilarity], result of:
            0.11285601 = score(doc=4797,freq=1.0), product of:
              0.16403274 = queryWeight, product of:
                1.1756668 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.015843173 = queryNorm
              0.688009 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.13286789 = weight(abstract_txt:booming in 4797) [ClassicSimilarity], result of:
            0.13286789 = score(doc=4797,freq=1.0), product of:
              0.18289174 = queryWeight, product of:
                1.2414123 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.015843173 = queryNorm
              0.72648376 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.06487511 = weight(abstract_txt:categories in 4797) [ClassicSimilarity], result of:
            0.06487511 = score(doc=4797,freq=2.0), product of:
              0.11340516 = queryWeight, product of:
                1.3824531 = boost
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.015843173 = queryNorm
              0.5720649 = fieldWeight in 4797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.046349913 = weight(abstract_txt:automatic in 4797) [ClassicSimilarity], result of:
            0.046349913 = score(doc=4797,freq=1.0), product of:
              0.11418876 = queryWeight, product of:
                1.387221 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015843173 = queryNorm
              0.40590608 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.04635954 = weight(abstract_txt:text in 4797) [ClassicSimilarity], result of:
            0.04635954 = score(doc=4797,freq=2.0), product of:
              0.10376174 = queryWeight, product of:
                1.6195644 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015843173 = queryNorm
              0.44678837 = fieldWeight in 4797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.034700613 = weight(abstract_txt:documents in 4797) [ClassicSimilarity], result of:
            0.034700613 = score(doc=4797,freq=1.0), product of:
              0.10777365 = queryWeight, product of:
                1.6505774 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.015843173 = queryNorm
              0.32197678 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
          0.08721059 = weight(abstract_txt:automated in 4797) [ClassicSimilarity], result of:
            0.08721059 = score(doc=4797,freq=1.0), product of:
              0.19922048 = queryWeight, product of:
                2.2441227 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.015843173 = queryNorm
              0.43775916 = fieldWeight in 4797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.078125 = fieldNorm(doc=4797)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.14
    0.13559592 = sum of:
      0.13559592 = product of:
        0.564983 = sum of:
          0.055048354 = weight(abstract_txt:categories in 1595) [ClassicSimilarity], result of:
            0.055048354 = score(doc=1595,freq=1.0), product of:
              0.11340516 = queryWeight, product of:
                1.3824531 = boost
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.015843173 = queryNorm
              0.48541313 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.05561989 = weight(abstract_txt:automatic in 1595) [ClassicSimilarity], result of:
            0.05561989 = score(doc=1595,freq=1.0), product of:
              0.11418876 = queryWeight, product of:
                1.387221 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015843173 = queryNorm
              0.48708728 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.08248515 = weight(abstract_txt:machine in 1595) [ClassicSimilarity], result of:
            0.08248515 = score(doc=1595,freq=2.0), product of:
              0.11786289 = queryWeight, product of:
                1.4093618 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.015843173 = queryNorm
              0.69983983 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.055631448 = weight(abstract_txt:text in 1595) [ClassicSimilarity], result of:
            0.055631448 = score(doc=1595,freq=2.0), product of:
              0.10376174 = queryWeight, product of:
                1.6195644 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015843173 = queryNorm
              0.53614604 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.09020905 = weight(abstract_txt:learning in 1595) [ClassicSimilarity], result of:
            0.09020905 = score(doc=1595,freq=2.0), product of:
              0.14321564 = queryWeight, product of:
                1.9027197 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.015843173 = queryNorm
              0.6298827 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.22598912 = weight(abstract_txt:classifier in 1595) [ClassicSimilarity], result of:
            0.22598912 = score(doc=1595,freq=1.0), product of:
              0.33283222 = queryWeight, product of:
                2.9006298 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015843173 = queryNorm
              0.6789881 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
        0.24 = coord(6/25)
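
The content-similarity scores combine three steps that the trees above spell out: a query weight (boost times idf times queryNorm), a field weight (tf times idf times fieldNorm), and a final coord factor for the fraction of query terms matched. The sketch below is a hedged illustration with hypothetical function names; the idf formula is an assumption that happens to reproduce the listed values, and the inputs are copied from the listing.

```python
import math

QUERY_NORM = 0.015843173  # queryNorm as printed in the listing
MAX_DOCS = 44218          # maxDocs as printed in the listing

def term_score(freq: float, doc_freq: int, field_norm: float, boost: float = 1.0) -> float:
    # Assumed idf form: 1 + ln(maxDocs / (docFreq + 1)).
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(term_scores: list[float], matched: int, total: int) -> float:
    # Sum of per-term scores, scaled by coord(matched/total).
    return sum(term_scores) * (matched / total)

# "approach" in doc 3389: freq=3, docFreq=2839, fieldNorm=0.078125, no boost.
approach_3389 = term_score(3.0, 2839, 0.078125)
# "classifier" in doc 3389: freq=4, docFreq=85, boost=2.9006298.
classifier_3389 = term_score(4.0, 85, 0.078125, boost=2.9006298)

# Entry 5 (doc 1595): the six per-term scores as listed, with coord(6/25).
scores_1595 = [0.055048354, 0.05561989, 0.08248515,
               0.055631448, 0.09020905, 0.22598912]
total_1595 = document_score(scores_1595, matched=6, total=25)

print(round(approach_3389, 6), round(classifier_3389, 5), round(total_1595, 5))
```

Because idf enters both the query weight and the field weight, rare terms such as "classifier" dominate the sum, while coord penalises documents that match only a few of the 25 query terms.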