Document (#34534)

Author
Pong, J.Y.-H.
Kwok, R.C.-W.
Lau, R.Y.-K.
Hao, J.-X.
Wong, P.C.-C.
Title
¬A comparative study of two automatic document classification methods in a library setting
Source
Journal of information science. 34(2008) no.2, pp. 213-230
Year
2008
Abstract
In current library practice, trained human experts usually carry out document cataloguing and indexing based on a manual approach. With the explosive growth in the number of electronic documents available on the Internet and in digital libraries, it is increasingly difficult for library practitioners to categorize both electronic documents and traditional library materials using just a manual approach. To improve the effectiveness and efficiency of document categorization in the library setting, more in-depth studies of using automatic document classification methods to categorize library items are required. Machine learning research has advanced rapidly in recent years. However, applying machine learning techniques to improve library practice is still a relatively unexplored area. This paper illustrates the design and development of a machine-learning-based automatic document classification system to alleviate the manual categorization problem encountered within the library setting. Two supervised machine learning algorithms have been tested. Our empirical tests show that supervised machine learning algorithms in general, and the k-nearest neighbours (KNN) algorithm in particular, can be used to develop an effective document classification system to enhance current library practice. Moreover, some concrete recommendations regarding how to practically apply the KNN algorithm to develop automatic document classification in a library setting are made. To the best of our knowledge, this is the first in-depth study of applying the KNN algorithm to automatic document classification based on the widely used LCC classification scheme adopted by many large libraries.
Theme
Automatisches Klassifizieren

Similar documents (author)

  1. Wu, H.C.; Luk, R.W.P.; Wong, K.F.; Kwok, K.L.: ¬A retrospective study of a hybrid document-context based retrieval model (2007) 3.79
    3.7856379 = sum of:
      3.7856379 = sum of:
        1.6429775 = weight(author_txt:wong in 2937) [ClassicSimilarity], result of:
          1.6429775 = score(doc=2937,freq=1.0), product of:
            0.64218414 = queryWeight, product of:
              8.186948 = idf(docFreq=31, maxDocs=42306)
              0.078439996 = queryNorm
            2.5584211 = fieldWeight in 2937, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.186948 = idf(docFreq=31, maxDocs=42306)
              0.3125 = fieldNorm(doc=2937)
        2.1426604 = weight(author_txt:kwok in 2937) [ClassicSimilarity], result of:
          2.1426604 = score(doc=2937,freq=1.0), product of:
            0.7665504 = queryWeight, product of:
              1.0925481 = boost
              8.944634 = idf(docFreq=14, maxDocs=42306)
              0.078439996 = queryNorm
            2.7951982 = fieldWeight in 2937, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.944634 = idf(docFreq=14, maxDocs=42306)
              0.3125 = fieldNorm(doc=2937)
    
  2. Kwok, K.L.: ¬The use of titles and cited titles as document representations for automatic classification (1975) 2.14
    2.1426604 = sum of:
      2.1426604 = product of:
        4.2853208 = sum of:
          4.2853208 = weight(author_txt:kwok in 4347) [ClassicSimilarity], result of:
            4.2853208 = score(doc=4347,freq=1.0), product of:
              0.7665504 = queryWeight, product of:
                1.0925481 = boost
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.078439996 = queryNorm
              5.5903964 = fieldWeight in 4347, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.625 = fieldNorm(doc=4347)
        0.5 = coord(1/2)
    
  3. Kwok, K.L.: Employing multiple representations for Chinese information retrieval (1999) 2.14
    2.1426604 = sum of:
      2.1426604 = product of:
        4.2853208 = sum of:
          4.2853208 = weight(author_txt:kwok in 4774) [ClassicSimilarity], result of:
            4.2853208 = score(doc=4774,freq=1.0), product of:
              0.7665504 = queryWeight, product of:
                1.0925481 = boost
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.078439996 = queryNorm
              5.5903964 = fieldWeight in 4774, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.625 = fieldNorm(doc=4774)
        0.5 = coord(1/2)
    
  4. Kwok, K.L.: ¬A network approach to probabilistic information retrieval (1995) 2.14
    2.1426604 = sum of:
      2.1426604 = product of:
        4.2853208 = sum of:
          4.2853208 = weight(author_txt:kwok in 697) [ClassicSimilarity], result of:
            4.2853208 = score(doc=697,freq=1.0), product of:
              0.7665504 = queryWeight, product of:
                1.0925481 = boost
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.078439996 = queryNorm
              5.5903964 = fieldWeight in 697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.625 = fieldNorm(doc=697)
        0.5 = coord(1/2)
    
  5. Kwok, K.L.: Improving English and Chinese ad-hoc retrieval : a TIPSTER text phase 3 project report (2000) 2.14
    2.1426604 = sum of:
      2.1426604 = product of:
        4.2853208 = sum of:
          4.2853208 = weight(author_txt:kwok in 389) [ClassicSimilarity], result of:
            4.2853208 = score(doc=389,freq=1.0), product of:
              0.7665504 = queryWeight, product of:
                1.0925481 = boost
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.078439996 = queryNorm
              5.5903964 = fieldWeight in 389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.944634 = idf(docFreq=14, maxDocs=42306)
                0.625 = fieldNorm(doc=389)
        0.5 = coord(1/2)
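
The author-similarity breakdowns above follow one repeating pattern, which can be reproduced by hand. The sketch below is a minimal Python reconstruction. It assumes the formulas that the listed figures are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function and parameter names are illustrative and are not part of any search-engine API. Small differences in the last digits are to be expected, since the listing was presumably produced with single-precision arithmetic.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency in the form the listing is consistent with (assumed).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, mirroring the "product of" lines.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


MAX_DOCS = 42306
QUERY_NORM = 0.078439996  # queryNorm of the author query, copied from the listing

# Entry 1 (doc=2937): both author terms match, so the two weights are simply summed.
wong_2937 = term_score(1.0, 31, MAX_DOCS, QUERY_NORM, 0.3125)
kwok_2937 = term_score(1.0, 14, MAX_DOCS, QUERY_NORM, 0.3125, boost=1.0925481)
total_2937 = wong_2937 + kwok_2937

# Entry 2 (doc=4347): only one of two terms matches, so coord(1/2) halves the sum.
kwok_4347 = term_score(1.0, 14, MAX_DOCS, QUERY_NORM, 0.625, boost=1.0925481)
total_4347 = kwok_4347 * (1 / 2)

print(total_2937, total_4347)
```

Under these assumptions the two totals agree with the listed 3.7856379 and 2.1426604 to roughly four decimal places.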
    

Similar documents (content)

  1. Wang, J.: ¬An extensive study on automated Dewey Decimal Classification (2009) 0.41
    0.41285363 = sum of:
      0.41285363 = product of:
        0.9383037 = sum of:
          0.03303197 = weight(abstract_txt:improve in 173) [ClassicSimilarity], result of:
            0.03303197 = score(doc=173,freq=1.0), product of:
              0.105487175 = queryWeight, product of:
                1.198233 = boost
                5.010197 = idf(docFreq=766, maxDocs=42306)
                0.017571287 = queryNorm
              0.31313732 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.010197 = idf(docFreq=766, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.05849958 = weight(abstract_txt:depth in 173) [ClassicSimilarity], result of:
            0.05849958 = score(doc=173,freq=1.0), product of:
              0.15441109 = queryWeight, product of:
                1.4497086 = boost
                6.061697 = idf(docFreq=267, maxDocs=42306)
                0.017571287 = queryNorm
              0.37885606 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.061697 = idf(docFreq=267, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.0776664 = weight(abstract_txt:categorization in 173) [ClassicSimilarity], result of:
            0.0776664 = score(doc=173,freq=1.0), product of:
              0.18652289 = queryWeight, product of:
                1.5933365 = boost
                6.6622515 = idf(docFreq=146, maxDocs=42306)
                0.017571287 = queryNorm
              0.41639072 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6622515 = idf(docFreq=146, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.116543524 = weight(abstract_txt:supervised in 173) [ClassicSimilarity], result of:
            0.116543524 = score(doc=173,freq=1.0), product of:
              0.24447556 = queryWeight, product of:
                1.8241442 = boost
                7.6273327 = idf(docFreq=55, maxDocs=42306)
                0.017571287 = queryNorm
              0.4767083 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6273327 = idf(docFreq=55, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.07405729 = weight(abstract_txt:algorithm in 173) [ClassicSimilarity], result of:
            0.07405729 = score(doc=173,freq=1.0), product of:
              0.20684847 = queryWeight, product of:
                2.0550067 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.017571287 = queryNorm
              0.35802677 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.073413365 = weight(abstract_txt:learning in 173) [ClassicSimilarity], result of:
            0.073413365 = score(doc=173,freq=1.0), product of:
              0.2438223 = queryWeight, product of:
                2.8803692 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.017571287 = queryNorm
              0.30109373 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.09210774 = weight(abstract_txt:automatic in 173) [ClassicSimilarity], result of:
            0.09210774 = score(doc=173,freq=1.0), product of:
              0.28363127 = queryWeight, product of:
                3.1066227 = boost
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.017571287 = queryNorm
              0.32474467 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.100345194 = weight(abstract_txt:machine in 173) [ClassicSimilarity], result of:
            0.100345194 = score(doc=173,freq=1.0), product of:
              0.30029935 = queryWeight, product of:
                3.1966026 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.017571287 = queryNorm
              0.33415055 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.15656468 = weight(abstract_txt:classification in 173) [ClassicSimilarity], result of:
            0.15656468 = score(doc=173,freq=7.0), product of:
              0.23624496 = queryWeight, product of:
                3.3547237 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.017571287 = queryNorm
              0.6627218 = fieldWeight in 173, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.07374522 = weight(abstract_txt:library in 173) [ClassicSimilarity], result of:
            0.07374522 = score(doc=173,freq=3.0), product of:
              0.21363981 = queryWeight, product of:
                3.8130064 = boost
                3.188681 = idf(docFreq=4740, maxDocs=42306)
                0.017571287 = queryNorm
              0.34518483 = fieldWeight in 173, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.188681 = idf(docFreq=4740, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
          0.0823287 = weight(abstract_txt:document in 173) [ClassicSimilarity], result of:
            0.0823287 = score(doc=173,freq=1.0), product of:
              0.30782047 = queryWeight, product of:
                4.0937395 = boost
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.017571287 = queryNorm
              0.26745686 = fieldWeight in 173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.0625 = fieldNorm(doc=173)
        0.44 = coord(11/25)
    
  2. Dietterich, T.G.: Machine-learning research : four current directions (1997) 0.37
    0.371511 = sum of:
      0.371511 = product of:
        1.326825 = sum of:
          0.04800104 = weight(abstract_txt:methods in 4322) [ClassicSimilarity], result of:
            0.04800104 = score(doc=4322,freq=1.0), product of:
              0.073471196 = queryWeight, product of:
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.017571287 = queryNorm
              0.6533314 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.052900776 = weight(abstract_txt:current in 4322) [ClassicSimilarity], result of:
            0.052900776 = score(doc=4322,freq=1.0), product of:
              0.078389525 = queryWeight, product of:
                1.032929 = boost
                4.319008 = idf(docFreq=1530, maxDocs=42306)
                0.017571287 = queryNorm
              0.674845 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.319008 = idf(docFreq=1530, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.12536858 = weight(abstract_txt:algorithms in 4322) [ClassicSimilarity], result of:
            0.12536858 = score(doc=4322,freq=1.0), product of:
              0.13934 = queryWeight, product of:
                1.377144 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.017571287 = queryNorm
              0.89973146 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.29135883 = weight(abstract_txt:supervised in 4322) [ClassicSimilarity], result of:
            0.29135883 = score(doc=4322,freq=1.0), product of:
              0.24447556 = queryWeight, product of:
                1.8241442 = boost
                7.6273327 = idf(docFreq=55, maxDocs=42306)
                0.017571287 = queryNorm
              1.1917708 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6273327 = idf(docFreq=55, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.41039318 = weight(abstract_txt:learning in 4322) [ClassicSimilarity], result of:
            0.41039318 = score(doc=4322,freq=5.0), product of:
              0.2438223 = queryWeight, product of:
                2.8803692 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.017571287 = queryNorm
              1.6831651 = fieldWeight in 4322, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.250863 = weight(abstract_txt:machine in 4322) [ClassicSimilarity], result of:
            0.250863 = score(doc=4322,freq=1.0), product of:
              0.30029935 = queryWeight, product of:
                3.1966026 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.017571287 = queryNorm
              0.8353764 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
          0.14793973 = weight(abstract_txt:classification in 4322) [ClassicSimilarity], result of:
            0.14793973 = score(doc=4322,freq=1.0), product of:
              0.23624496 = queryWeight, product of:
                3.3547237 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.017571287 = queryNorm
              0.62621325 = fieldWeight in 4322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.15625 = fieldNorm(doc=4322)
        0.28 = coord(7/25)
    
  3. Li, Y.; Shawe-Taylor, J.: Advanced learning algorithms for cross-language patent retrieval and classification (2007) 0.28
    0.2835223 = sum of:
      0.2835223 = product of:
        0.88600725 = sum of:
          0.02400052 = weight(abstract_txt:methods in 2932) [ClassicSimilarity], result of:
            0.02400052 = score(doc=2932,freq=1.0), product of:
              0.073471196 = queryWeight, product of:
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.017571287 = queryNorm
              0.3266657 = fieldWeight in 2932, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.01636774 = weight(abstract_txt:based in 2932) [ClassicSimilarity], result of:
            0.01636774 = score(doc=2932,freq=1.0), product of:
              0.06516178 = queryWeight, product of:
                1.1534096 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.017571287 = queryNorm
              0.25118622 = fieldWeight in 2932, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.12536858 = weight(abstract_txt:algorithms in 2932) [ClassicSimilarity], result of:
            0.12536858 = score(doc=2932,freq=4.0), product of:
              0.13934 = queryWeight, product of:
                1.377144 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.017571287 = queryNorm
              0.89973146 = fieldWeight in 2932, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.09257161 = weight(abstract_txt:algorithm in 2932) [ClassicSimilarity], result of:
            0.09257161 = score(doc=2932,freq=1.0), product of:
              0.20684847 = queryWeight, product of:
                2.0550067 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.017571287 = queryNorm
              0.44753346 = fieldWeight in 2932, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.24279189 = weight(abstract_txt:learning in 2932) [ClassicSimilarity], result of:
            0.24279189 = score(doc=2932,freq=7.0), product of:
              0.2438223 = queryWeight, product of:
                2.8803692 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.017571287 = queryNorm
              0.9957739 = fieldWeight in 2932, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.17738691 = weight(abstract_txt:machine in 2932) [ClassicSimilarity], result of:
            0.17738691 = score(doc=2932,freq=2.0), product of:
              0.30029935 = queryWeight, product of:
                3.1966026 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.017571287 = queryNorm
              0.59070027 = fieldWeight in 2932, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.104609184 = weight(abstract_txt:classification in 2932) [ClassicSimilarity], result of:
            0.104609184 = score(doc=2932,freq=2.0), product of:
              0.23624496 = queryWeight, product of:
                3.3547237 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.017571287 = queryNorm
              0.44279963 = fieldWeight in 2932, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
          0.10291087 = weight(abstract_txt:document in 2932) [ClassicSimilarity], result of:
            0.10291087 = score(doc=2932,freq=1.0), product of:
              0.30782047 = queryWeight, product of:
                4.0937395 = boost
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.017571287 = queryNorm
              0.33432108 = fieldWeight in 2932, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.078125 = fieldNorm(doc=2932)
        0.32 = coord(8/25)
    
  4. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.25
    0.2497636 = sum of:
      0.2497636 = product of:
        0.78051126 = sum of:
          0.041570123 = weight(abstract_txt:methods in 481) [ClassicSimilarity], result of:
            0.041570123 = score(doc=481,freq=3.0), product of:
              0.073471196 = queryWeight, product of:
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.017571287 = queryNorm
              0.5658016 = fieldWeight in 481, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.01636774 = weight(abstract_txt:based in 481) [ClassicSimilarity], result of:
            0.01636774 = score(doc=481,freq=1.0), product of:
              0.06516178 = queryWeight, product of:
                1.1534096 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.017571287 = queryNorm
              0.25118622 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.041289963 = weight(abstract_txt:improve in 481) [ClassicSimilarity], result of:
            0.041289963 = score(doc=481,freq=1.0), product of:
              0.105487175 = queryWeight, product of:
                1.198233 = boost
                5.010197 = idf(docFreq=766, maxDocs=42306)
                0.017571287 = queryNorm
              0.39142165 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.010197 = idf(docFreq=766, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.12977771 = weight(abstract_txt:learning in 481) [ClassicSimilarity], result of:
            0.12977771 = score(doc=481,freq=2.0), product of:
              0.2438223 = queryWeight, product of:
                2.8803692 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.017571287 = queryNorm
              0.5322635 = fieldWeight in 481, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.11513468 = weight(abstract_txt:automatic in 481) [ClassicSimilarity], result of:
            0.11513468 = score(doc=481,freq=1.0), product of:
              0.28363127 = queryWeight, product of:
                3.1066227 = boost
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.017571287 = queryNorm
              0.40593085 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.1254315 = weight(abstract_txt:machine in 481) [ClassicSimilarity], result of:
            0.1254315 = score(doc=481,freq=1.0), product of:
              0.30029935 = queryWeight, product of:
                3.1966026 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.017571287 = queryNorm
              0.4176882 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.16540165 = weight(abstract_txt:classification in 481) [ClassicSimilarity], result of:
            0.16540165 = score(doc=481,freq=5.0), product of:
              0.23624496 = queryWeight, product of:
                3.3547237 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.017571287 = queryNorm
              0.7001277 = fieldWeight in 481, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
          0.14553794 = weight(abstract_txt:document in 481) [ClassicSimilarity], result of:
            0.14553794 = score(doc=481,freq=2.0), product of:
              0.30782047 = queryWeight, product of:
                4.0937395 = boost
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.017571287 = queryNorm
              0.4728014 = fieldWeight in 481, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2793097 = idf(docFreq=1592, maxDocs=42306)
                0.078125 = fieldNorm(doc=481)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.25
    0.24557556 = sum of:
      0.24557556 = product of:
        0.8770556 = sum of:
          0.028800625 = weight(abstract_txt:methods in 2596) [ClassicSimilarity], result of:
            0.028800625 = score(doc=2596,freq=1.0), product of:
              0.073471196 = queryWeight, product of:
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.017571287 = queryNorm
              0.39199886 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.181321 = idf(docFreq=1756, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.019641291 = weight(abstract_txt:based in 2596) [ClassicSimilarity], result of:
            0.019641291 = score(doc=2596,freq=1.0), product of:
              0.06516178 = queryWeight, product of:
                1.1534096 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.017571287 = queryNorm
              0.3014235 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.16475531 = weight(abstract_txt:categorization in 2596) [ClassicSimilarity], result of:
            0.16475531 = score(doc=2596,freq=2.0), product of:
              0.18652289 = queryWeight, product of:
                1.5933365 = boost
                6.6622515 = idf(docFreq=146, maxDocs=42306)
                0.017571287 = queryNorm
              0.8832981 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6622515 = idf(docFreq=146, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.15709923 = weight(abstract_txt:algorithm in 2596) [ClassicSimilarity], result of:
            0.15709923 = score(doc=2596,freq=2.0), product of:
              0.20684847 = queryWeight, product of:
                2.0550067 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.017571287 = queryNorm
              0.7594895 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.15573326 = weight(abstract_txt:learning in 2596) [ClassicSimilarity], result of:
            0.15573326 = score(doc=2596,freq=2.0), product of:
              0.2438223 = queryWeight, product of:
                2.8803692 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.017571287 = queryNorm
              0.6387162 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.13816161 = weight(abstract_txt:automatic in 2596) [ClassicSimilarity], result of:
            0.13816161 = score(doc=2596,freq=1.0), product of:
              0.28363127 = queryWeight, product of:
                3.1066227 = boost
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.017571287 = queryNorm
              0.487117 = fieldWeight in 2596, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
          0.2128643 = weight(abstract_txt:machine in 2596) [ClassicSimilarity], result of:
            0.2128643 = score(doc=2596,freq=2.0), product of:
              0.30029935 = queryWeight, product of:
                3.1966026 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.017571287 = queryNorm
              0.70884037 = fieldWeight in 2596, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.09375 = fieldNorm(doc=2596)
        0.28 = coord(7/25)
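
The document-level scores in the content-similarity list are built from the per-term weights in two steps: the weights of the matching terms are summed, and the sum is scaled by coord(matched/total), the fraction of the 25 query terms found in the document. As a rough check, the sketch below re-adds the term weights copied from entry 5 (doc=2596). The variable names are illustrative only.

```python
# Per-term weights for doc=2596, copied from the breakdown of entry 5.
term_weights_2596 = {
    "methods": 0.028800625,
    "based": 0.019641291,
    "categorization": 0.16475531,
    "algorithm": 0.15709923,
    "learning": 0.15573326,
    "automatic": 0.13816161,
    "machine": 0.2128643,
}

TOTAL_QUERY_TERMS = 25  # denominator of coord(7/25) in the listing

matched_terms = len(term_weights_2596)
coord = matched_terms / TOTAL_QUERY_TERMS      # fraction of query terms matched
raw_sum = sum(term_weights_2596.values())      # the "sum of" line
score = raw_sum * coord                        # the "product of" line

print(matched_terms, coord, raw_sum, score)
```

The result matches the listed sum of 0.8770556 and final score of 0.24557556 up to rounding. The same check works for the other four entries, using their own coord factors (11/25, 7/25, 8/25 and 8/25).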