Document (#34533)

Author
Pong, J.Y.-H.
Kwok, R.C.-W.
Lau, R.Y.-K.
Hao, J.-X.
Wong, P.C.-C.
Title
¬A comparative study of two automatic document classification methods in a library setting
Source
Journal of information science. 34(2008) no.2, pp.213-230
Year
2008
Abstract
In current library practice, trained human experts usually catalogue and index documents manually. With the explosive growth in the number of electronic documents available on the Internet and in digital libraries, it is increasingly difficult for library practitioners to categorize both electronic documents and traditional library materials using a manual approach alone. To improve the effectiveness and efficiency of document categorization in the library setting, more in-depth studies of automatic document classification methods for categorizing library items are required. Machine learning research has advanced rapidly in recent years. However, applying machine learning techniques to improve library practice is still a relatively unexplored area. This paper illustrates the design and development of a machine-learning-based automatic document classification system to alleviate the manual categorization problem encountered within the library setting. Two supervised machine learning algorithms have been tested. Our empirical tests show that supervised machine learning algorithms in general, and the k-nearest neighbours (KNN) algorithm in particular, can be used to develop an effective document classification system to enhance current library practice. Moreover, some concrete recommendations are made on how to apply the KNN algorithm in practice to develop automatic document classification in a library setting. To the best of our knowledge, this is the first in-depth study of applying the KNN algorithm to automatic document classification based on the widely used LCC classification scheme adopted by many large libraries.
Theme
Automatic classification

Similar documents (author)

  1. Wu, H.C.; Luk, R.W.P.; Wong, K.F.; Kwok, K.L.: ¬A retrospective study of a hybrid document-context based retrieval model (2007) 3.80
    3.7983823 = sum of:
      3.7983823 = sum of:
        1.6393043 = weight(author_txt:wong in 936) [ClassicSimilarity], result of:
          1.6393043 = score(doc=936,freq=1.0), product of:
            0.6396989 = queryWeight, product of:
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.07800846 = queryNorm
            2.5626185 = fieldWeight in 936, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.3125 = fieldNorm(doc=936)
        2.1590781 = weight(author_txt:kwok in 936) [ClassicSimilarity], result of:
          2.1590781 = score(doc=936,freq=1.0), product of:
            0.7686255 = queryWeight, product of:
              1.096149 = boost
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.07800846 = queryNorm
            2.8090117 = fieldWeight in 936, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.3125 = fieldNorm(doc=936)
    
  2. Kwok, K.L.: ¬The use of titles and cited titles as document representations for automatic classification (1975) 2.16
    2.1590781 = sum of:
      2.1590781 = product of:
        4.3181562 = sum of:
          4.3181562 = weight(author_txt:kwok in 4347) [ClassicSimilarity], result of:
            4.3181562 = score(doc=4347,freq=1.0), product of:
              0.7686255 = queryWeight, product of:
                1.096149 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07800846 = queryNorm
              5.6180234 = fieldWeight in 4347, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=4347)
        0.5 = coord(1/2)
    
  3. Kwok, K.L.: Employing multiple representations for Chinese information retrieval (1999) 2.16
    2.1590781 = sum of:
      2.1590781 = product of:
        4.3181562 = sum of:
          4.3181562 = weight(author_txt:kwok in 3773) [ClassicSimilarity], result of:
            4.3181562 = score(doc=3773,freq=1.0), product of:
              0.7686255 = queryWeight, product of:
                1.096149 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07800846 = queryNorm
              5.6180234 = fieldWeight in 3773, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=3773)
        0.5 = coord(1/2)
    
  4. Kwok, K.L.: ¬A network approach to probabilistic information retrieval (1995) 2.16
    2.1590781 = sum of:
      2.1590781 = product of:
        4.3181562 = sum of:
          4.3181562 = weight(author_txt:kwok in 5696) [ClassicSimilarity], result of:
            4.3181562 = score(doc=5696,freq=1.0), product of:
              0.7686255 = queryWeight, product of:
                1.096149 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07800846 = queryNorm
              5.6180234 = fieldWeight in 5696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=5696)
        0.5 = coord(1/2)
    
  5. Kwok, K.L.: Improving English and Chinese ad-hoc retrieval : a TIPSTER text phase 3 project report (2000) 2.16
    2.1590781 = sum of:
      2.1590781 = product of:
        4.3181562 = sum of:
          4.3181562 = weight(author_txt:kwok in 6388) [ClassicSimilarity], result of:
            4.3181562 = score(doc=6388,freq=1.0), product of:
              0.7686255 = queryWeight, product of:
                1.096149 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07800846 = queryNorm
              5.6180234 = fieldWeight in 6388, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=6388)
        0.5 = coord(1/2)
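
The score breakdowns above follow the usual TF-IDF pattern of Lucene's ClassicSimilarity: each term score is queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm). The following is a minimal sketch that recomputes the figures for entries 1 and 2 from the values shown in the listing. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the listed numbers; the function names are illustrative and not part of any library.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # Hypothetical helper mirroring one "weight(...)" node of the listing
    tf = math.sqrt(freq)                           # tf(freq)
    term_idf = idf(doc_freq, max_docs)             # idf(docFreq, maxDocs)
    query_weight = boost * term_idf * query_norm   # queryWeight
    field_weight = tf * term_idf * field_norm      # fieldWeight
    return query_weight * field_weight

MAX_DOCS = 44218
QUERY_NORM = 0.07800846

# Entry 1 (doc 936): both author terms match, so the two term scores are summed
wong = term_score(1.0, 32, MAX_DOCS, 0.3125, QUERY_NORM)
kwok = term_score(1.0, 14, MAX_DOCS, 0.3125, QUERY_NORM, boost=1.096149)
entry1 = wong + kwok

# Entry 2 (doc 4347): only "kwok" matches, so coord(1/2) halves the score
entry2 = term_score(1.0, 14, MAX_DOCS, 0.625, QUERY_NORM, boost=1.096149) * (1 / 2)

print(round(wong, 4), round(kwok, 4), round(entry1, 4), round(entry2, 4))
```

The printed values should agree with the listing to about four decimal places; small differences in later digits are expected because the listing was produced with single-precision arithmetic.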
    

Similar documents (content)

  1. Wang, J.: ¬An extensive study on automated Dewey Decimal Classification (2009) 0.41
    0.4092438 = sum of:
      0.4092438 = product of:
        0.93009955 = sum of:
          0.032521922 = weight(abstract_txt:improve in 3172) [ClassicSimilarity], result of:
            0.032521922 = score(doc=3172,freq=1.0), product of:
              0.104911104 = queryWeight, product of:
                1.1960977 = boost
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.017683983 = queryNorm
              0.30999503 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.0572775 = weight(abstract_txt:depth in 3172) [ClassicSimilarity], result of:
            0.0572775 = score(doc=3172,freq=1.0), product of:
              0.15300068 = queryWeight, product of:
                1.4444504 = boost
                5.989777 = idf(docFreq=300, maxDocs=44218)
                0.017683983 = queryNorm
              0.37436107 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.989777 = idf(docFreq=300, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.076312296 = weight(abstract_txt:categorization in 3172) [ClassicSimilarity], result of:
            0.076312296 = score(doc=3172,freq=1.0), product of:
              0.18525375 = queryWeight, product of:
                1.5894228 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.017683983 = queryNorm
              0.41193387 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.11077815 = weight(abstract_txt:supervised in 3172) [ClassicSimilarity], result of:
            0.11077815 = score(doc=3172,freq=1.0), product of:
              0.23750536 = queryWeight, product of:
                1.799669 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.017683983 = queryNorm
              0.4664238 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.07425178 = weight(abstract_txt:algorithm in 3172) [ClassicSimilarity], result of:
            0.07425178 = score(doc=3172,freq=1.0), product of:
              0.20822793 = queryWeight, product of:
                2.0638163 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.017683983 = queryNorm
              0.35658893 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.071451664 = weight(abstract_txt:learning in 3172) [ClassicSimilarity], result of:
            0.071451664 = score(doc=3172,freq=1.0), product of:
              0.24063505 = queryWeight, product of:
                2.8642135 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017683983 = queryNorm
              0.29692957 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.09345416 = weight(abstract_txt:automatic in 3172) [ClassicSimilarity], result of:
            0.09345416 = score(doc=3172,freq=1.0), product of:
              0.28779492 = queryWeight, product of:
                3.13233 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017683983 = queryNorm
              0.32472485 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.0980007 = weight(abstract_txt:machine in 3172) [ClassicSimilarity], result of:
            0.0980007 = score(doc=3172,freq=1.0), product of:
              0.29705495 = queryWeight, product of:
                3.1823235 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017683983 = queryNorm
              0.32990766 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.15702371 = weight(abstract_txt:classification in 3172) [ClassicSimilarity], result of:
            0.15702371 = score(doc=3172,freq=7.0), product of:
              0.23786883 = queryWeight, product of:
                3.3694477 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017683983 = queryNorm
              0.66012734 = fieldWeight in 3172, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.07469884 = weight(abstract_txt:library in 3172) [ClassicSimilarity], result of:
            0.07469884 = score(doc=3172,freq=3.0), product of:
              0.2165357 = queryWeight, product of:
                3.8424275 = boost
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.017683983 = queryNorm
              0.3449724 = fieldWeight in 3172, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
          0.08432878 = weight(abstract_txt:document in 3172) [ClassicSimilarity], result of:
            0.08432878 = score(doc=3172,freq=1.0), product of:
              0.31432182 = queryWeight, product of:
                4.1406946 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017683983 = queryNorm
              0.26828802 = fieldWeight in 3172, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3172)
        0.44 = coord(11/25)
    
  2. Dietterich, T.G.: Machine-learning research : four current directions (1997) 0.36
    0.36234382 = sum of:
      0.36234382 = product of:
        1.294085 = sum of:
          0.04751341 = weight(abstract_txt:methods in 3321) [ClassicSimilarity], result of:
            0.04751341 = score(doc=3321,freq=1.0), product of:
              0.07333109 = queryWeight, product of:
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017683983 = queryNorm
              0.64792997 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.052908067 = weight(abstract_txt:current in 3321) [ClassicSimilarity], result of:
            0.052908067 = score(doc=3321,freq=1.0), product of:
              0.07878168 = queryWeight, product of:
                1.0364982 = boost
                4.298101 = idf(docFreq=1633, maxDocs=44218)
                0.017683983 = queryNorm
              0.6715783 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.298101 = idf(docFreq=1633, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.123915896 = weight(abstract_txt:algorithms in 3321) [ClassicSimilarity], result of:
            0.123915896 = score(doc=3321,freq=1.0), product of:
              0.13894044 = queryWeight, product of:
                1.3764812 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.017683983 = queryNorm
              0.8918634 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.27694538 = weight(abstract_txt:supervised in 3321) [ClassicSimilarity], result of:
            0.27694538 = score(doc=3321,freq=1.0), product of:
              0.23750536 = queryWeight, product of:
                1.799669 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.017683983 = queryNorm
              1.1660595 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.39942697 = weight(abstract_txt:learning in 3321) [ClassicSimilarity], result of:
            0.39942697 = score(doc=3321,freq=5.0), product of:
              0.24063505 = queryWeight, product of:
                2.8642135 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017683983 = queryNorm
              1.6598868 = fieldWeight in 3321, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.24500175 = weight(abstract_txt:machine in 3321) [ClassicSimilarity], result of:
            0.24500175 = score(doc=3321,freq=1.0), product of:
              0.29705495 = queryWeight, product of:
                3.1823235 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017683983 = queryNorm
              0.82476914 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
          0.14837348 = weight(abstract_txt:classification in 3321) [ClassicSimilarity], result of:
            0.14837348 = score(doc=3321,freq=1.0), product of:
              0.23786883 = queryWeight, product of:
                3.3694477 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017683983 = queryNorm
              0.6237618 = fieldWeight in 3321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.15625 = fieldNorm(doc=3321)
        0.28 = coord(7/25)
    
  3. Li, Y.; Shawe-Taylor, J.: Advanced learning algorithms for cross-language patent retrieval and classification (2007) 0.28
    0.2804966 = sum of:
      0.2804966 = product of:
        0.8765519 = sum of:
          0.023756705 = weight(abstract_txt:methods in 931) [ClassicSimilarity], result of:
            0.023756705 = score(doc=931,freq=1.0), product of:
              0.07333109 = queryWeight, product of:
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017683983 = queryNorm
              0.32396498 = fieldWeight in 931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.01619119 = weight(abstract_txt:based in 931) [ClassicSimilarity], result of:
            0.01619119 = score(doc=931,freq=1.0), product of:
              0.06501002 = queryWeight, product of:
                1.1531657 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.017683983 = queryNorm
              0.24905685 = fieldWeight in 931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.123915896 = weight(abstract_txt:algorithms in 931) [ClassicSimilarity], result of:
            0.123915896 = score(doc=931,freq=4.0), product of:
              0.13894044 = queryWeight, product of:
                1.3764812 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.017683983 = queryNorm
              0.8918634 = fieldWeight in 931, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.09281472 = weight(abstract_txt:algorithm in 931) [ClassicSimilarity], result of:
            0.09281472 = score(doc=931,freq=1.0), product of:
              0.20822793 = queryWeight, product of:
                2.0638163 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.017683983 = queryNorm
              0.44573617 = fieldWeight in 931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.23630416 = weight(abstract_txt:learning in 931) [ClassicSimilarity], result of:
            0.23630416 = score(doc=931,freq=7.0), product of:
              0.24063505 = queryWeight, product of:
                2.8642135 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017683983 = queryNorm
              0.98200226 = fieldWeight in 931, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.1732424 = weight(abstract_txt:machine in 931) [ClassicSimilarity], result of:
            0.1732424 = score(doc=931,freq=2.0), product of:
              0.29705495 = queryWeight, product of:
                3.1823235 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017683983 = queryNorm
              0.58319986 = fieldWeight in 931, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.10491589 = weight(abstract_txt:classification in 931) [ClassicSimilarity], result of:
            0.10491589 = score(doc=931,freq=2.0), product of:
              0.23786883 = queryWeight, product of:
                3.3694477 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017683983 = queryNorm
              0.44106615 = fieldWeight in 931, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
          0.10541097 = weight(abstract_txt:document in 931) [ClassicSimilarity], result of:
            0.10541097 = score(doc=931,freq=1.0), product of:
              0.31432182 = queryWeight, product of:
                4.1406946 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017683983 = queryNorm
              0.33536002 = fieldWeight in 931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=931)
        0.32 = coord(8/25)
    
  4. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.25
    0.24914561 = sum of:
      0.24914561 = product of:
        0.77858007 = sum of:
          0.04114782 = weight(abstract_txt:methods in 5480) [ClassicSimilarity], result of:
            0.04114782 = score(doc=5480,freq=3.0), product of:
              0.07333109 = queryWeight, product of:
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017683983 = queryNorm
              0.56112385 = fieldWeight in 5480, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.01619119 = weight(abstract_txt:based in 5480) [ClassicSimilarity], result of:
            0.01619119 = score(doc=5480,freq=1.0), product of:
              0.06501002 = queryWeight, product of:
                1.1531657 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.017683983 = queryNorm
              0.24905685 = fieldWeight in 5480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.0406524 = weight(abstract_txt:improve in 5480) [ClassicSimilarity], result of:
            0.0406524 = score(doc=5480,freq=1.0), product of:
              0.104911104 = queryWeight, product of:
                1.1960977 = boost
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.017683983 = queryNorm
              0.3874938 = fieldWeight in 5480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.12630989 = weight(abstract_txt:learning in 5480) [ClassicSimilarity], result of:
            0.12630989 = score(doc=5480,freq=2.0), product of:
              0.24063505 = queryWeight, product of:
                2.8642135 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017683983 = queryNorm
              0.5249023 = fieldWeight in 5480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.116817705 = weight(abstract_txt:automatic in 5480) [ClassicSimilarity], result of:
            0.116817705 = score(doc=5480,freq=1.0), product of:
              0.28779492 = queryWeight, product of:
                3.13233 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017683983 = queryNorm
              0.40590608 = fieldWeight in 5480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.122500874 = weight(abstract_txt:machine in 5480) [ClassicSimilarity], result of:
            0.122500874 = score(doc=5480,freq=1.0), product of:
              0.29705495 = queryWeight, product of:
                3.1823235 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017683983 = queryNorm
              0.41238457 = fieldWeight in 5480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.1658866 = weight(abstract_txt:classification in 5480) [ClassicSimilarity], result of:
            0.1658866 = score(doc=5480,freq=5.0), product of:
              0.23786883 = queryWeight, product of:
                3.3694477 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017683983 = queryNorm
              0.69738686 = fieldWeight in 5480, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
          0.14907363 = weight(abstract_txt:document in 5480) [ClassicSimilarity], result of:
            0.14907363 = score(doc=5480,freq=2.0), product of:
              0.31432182 = queryWeight, product of:
                4.1406946 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017683983 = queryNorm
              0.4742707 = fieldWeight in 5480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=5480)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.24
    0.2427533 = sum of:
      0.2427533 = product of:
        0.8669761 = sum of:
          0.028508047 = weight(abstract_txt:methods in 1595) [ClassicSimilarity], result of:
            0.028508047 = score(doc=1595,freq=1.0), product of:
              0.07333109 = queryWeight, product of:
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017683983 = queryNorm
              0.388758 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.01942943 = weight(abstract_txt:based in 1595) [ClassicSimilarity], result of:
            0.01942943 = score(doc=1595,freq=1.0), product of:
              0.06501002 = queryWeight, product of:
                1.1531657 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.017683983 = queryNorm
              0.29886824 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.16188282 = weight(abstract_txt:categorization in 1595) [ClassicSimilarity], result of:
            0.16188282 = score(doc=1595,freq=2.0), product of:
              0.18525375 = queryWeight, product of:
                1.5894228 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.017683983 = queryNorm
              0.87384367 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.1575118 = weight(abstract_txt:algorithm in 1595) [ClassicSimilarity], result of:
            0.1575118 = score(doc=1595,freq=2.0), product of:
              0.20822793 = queryWeight, product of:
                2.0638163 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.017683983 = queryNorm
              0.7564393 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.15157185 = weight(abstract_txt:learning in 1595) [ClassicSimilarity], result of:
            0.15157185 = score(doc=1595,freq=2.0), product of:
              0.24063505 = queryWeight, product of:
                2.8642135 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017683983 = queryNorm
              0.6298827 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.14018124 = weight(abstract_txt:automatic in 1595) [ClassicSimilarity], result of:
            0.14018124 = score(doc=1595,freq=1.0), product of:
              0.28779492 = queryWeight, product of:
                3.13233 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017683983 = queryNorm
              0.48708728 = fieldWeight in 1595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
          0.20789088 = weight(abstract_txt:machine in 1595) [ClassicSimilarity], result of:
            0.20789088 = score(doc=1595,freq=2.0), product of:
              0.29705495 = queryWeight, product of:
                3.1823235 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017683983 = queryNorm
              0.69983983 = fieldWeight in 1595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.09375 = fieldNorm(doc=1595)
        0.28 = coord(7/25)
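
For the content-based matches, the final score is the sum of the matching term scores multiplied by the coordination factor coord(matched/total), where 25 is the number of query terms. A small sketch, using the per-term scores listed for entry 2 (doc 3321) as input, is shown below; the `combine` helper is a hypothetical name introduced only for illustration.

```python
def combine(term_scores, total_query_terms):
    # Sum of matching term scores, scaled by coord(matched / total)
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord

# Per-term scores copied from the listing for doc 3321
doc_3321_terms = [
    0.04751341,   # methods
    0.052908067,  # current
    0.123915896,  # algorithms
    0.27694538,   # supervised
    0.39942697,   # learning
    0.24500175,   # machine
    0.14837348,   # classification
]

score_3321 = combine(doc_3321_terms, 25)
print(round(score_3321, 4))
```

The result should match the listed total for that entry (0.36234382) to within rounding. The coord factor is why a document matching only a few query terms strongly can still rank below one matching many terms weakly.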