Document (#11258)

Author
Cheng, P.T.K.
Wu, A.K.W.
Title
ACS: an automatic classification system
Source
Journal of information science. 21(1995) no.4, pp.289-299
Year
1995
Abstract
In this paper, we introduce ACS, an automatic classification system for school libraries. First, various approaches towards automatic classification, namely (i) rule-based, (ii) browse and search, and (iii) partial match, are critically reviewed. The central issues of scheme selection, text analysis and similarity measures are discussed. A novel approach towards detecting book-class similarity with the Modified Overlap Coefficient (MOC) is also proposed. Finally, the design and implementation of ACS are presented. The test results of over 80% correctness in automatic classification and a cost reduction of 75% compared to manual classification suggest that ACS is highly adoptable.
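The abstract does not say how the Modified Overlap Coefficient changes the standard measure, so the following is only a sketch of the ordinary overlap coefficient, |A ∩ B| / min(|A|, |B|), applied to term sets. The function name and the sample term sets are hypothetical and are not taken from the paper.

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Standard overlap coefficient between two term sets (not the paper's MOC)."""
    if not a or not b:
        return 0.0  # avoid division by zero for empty sets
    return len(a & b) / min(len(a), len(b))

# Hypothetical term sets for a book and a class description.
book_terms = {"automatic", "classification", "library", "school"}
class_terms = {"classification", "library", "science", "cataloguing", "indexing"}

score = overlap_coefficient(book_terms, class_terms)
print(score)
```

Dividing by the smaller set means a short book description is not penalised for matching a class with a long description.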
Theme
Automatic classification

Similar documents (author)

  1. Cheng, L.R.L.: Beyond bilingualism : a quest for communicative competence (1996) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:cheng in 5223) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 5223, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=5223)
    
  2. Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 4.16
    4.164796 = sum of:
      4.164796 = weight(author_txt:cheng in 1811) [ClassicSimilarity], result of:
        4.164796 = fieldWeight in 1811, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.5 = fieldNorm(doc=1811)
    
  3. Cheng, L.-y.: On bibliographic(al) control (1998) 4.16
    4.164796 = sum of:
      4.164796 = weight(author_txt:cheng in 3376) [ClassicSimilarity], result of:
        4.164796 = fieldWeight in 3376, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.5 = fieldNorm(doc=3376)
    
  4. Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 3.64
    3.6441965 = sum of:
      3.6441965 = weight(author_txt:cheng in 4216) [ClassicSimilarity], result of:
        3.6441965 = fieldWeight in 4216, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.4375 = fieldNorm(doc=4216)
    
  5. Cheng, W.-N.; Khoo, C.S.G.: Information and argument structures in Sociology research abstracts (2018) 3.64
    3.6441965 = sum of:
      3.6441965 = weight(author_txt:cheng in 4750) [ClassicSimilarity], result of:
        3.6441965 = fieldWeight in 4750, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.4375 = fieldNorm(doc=4750)
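
The author scores above can be recomputed from the listed factors. The sketch below assumes the tf and idf formulas that reproduce the values shown (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). Other Lucene versions may compute idf slightly differently, and the function names are illustrative only.

```python
import math

def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# First author hit: freq=1.0, docFreq=28, maxDocs=44218, fieldNorm=0.625
first_hit = field_weight(1.0, 28, 44218, 0.625)
print(round(first_hit, 4))
```

Only fieldNorm differs between the five author hits, which is why the ranking follows the fieldNorm values (0.625, 0.5, 0.5, 0.4375, 0.4375).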
    

Similar documents (content)

  1. Gowtham, M.S.; Kamat, S.K.: An expert system as a tool to classification (1995) 0.18
    0.17917442 = sum of:
      0.17917442 = product of:
        0.6399087 = sum of:
          0.09925355 = weight(abstract_txt:scheme in 3735) [ClassicSimilarity], result of:
            0.09925355 = score(doc=3735,freq=4.0), product of:
              0.11586837 = queryWeight, product of:
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.021135071 = queryNorm
              0.8566061 = fieldWeight in 3735, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.087112606 = weight(abstract_txt:class in 3735) [ClassicSimilarity], result of:
            0.087112606 = score(doc=3735,freq=2.0), product of:
              0.13382323 = queryWeight, product of:
                1.0746902 = boost
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.021135071 = queryNorm
              0.6509528 = fieldWeight in 3735, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.06346485 = weight(abstract_txt:manual in 3735) [ClassicSimilarity], result of:
            0.06346485 = score(doc=3735,freq=1.0), product of:
              0.13651374 = queryWeight, product of:
                1.0854398 = boost
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.021135071 = queryNorm
              0.4648972 = fieldWeight in 3735, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.08720015 = weight(abstract_txt:rule in 3735) [ClassicSimilarity], result of:
            0.08720015 = score(doc=3735,freq=1.0), product of:
              0.16871963 = queryWeight, product of:
                1.206703 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.021135071 = queryNorm
              0.5168346 = fieldWeight in 3735, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.04001337 = weight(abstract_txt:system in 3735) [ClassicSimilarity], result of:
            0.04001337 = score(doc=3735,freq=3.0), product of:
              0.08768538 = queryWeight, product of:
                1.2302579 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.021135071 = queryNorm
              0.45632887 = fieldWeight in 3735, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.096920975 = weight(abstract_txt:modified in 3735) [ClassicSimilarity], result of:
            0.096920975 = score(doc=3735,freq=1.0), product of:
              0.18103644 = queryWeight, product of:
                1.2499728 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.021135071 = queryNorm
              0.5353672 = fieldWeight in 3735, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
          0.16594315 = weight(abstract_txt:classification in 3735) [ClassicSimilarity], result of:
            0.16594315 = score(doc=3735,freq=3.0), product of:
              0.30719206 = queryWeight, product of:
                3.6408901 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021135071 = queryNorm
              0.5401935 = fieldWeight in 3735, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=3735)
        0.28 = coord(7/25)
    
  2. Pong, J.Y.-H.; Kwok, R.C.-W.; Lau, R.Y.-K.; Hao, J.-X.; Wong, P.C.-C.: A comparative study of two automatic document classification methods in a library setting (2008) 0.12
    0.122373246 = sum of:
      0.122373246 = product of:
        0.61186624 = sum of:
          0.03970142 = weight(abstract_txt:scheme in 2532) [ClassicSimilarity], result of:
            0.03970142 = score(doc=2532,freq=1.0), product of:
              0.11586837 = queryWeight, product of:
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.021135071 = queryNorm
              0.34264246 = fieldWeight in 2532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.0625 = fieldNorm(doc=2532)
          0.08793948 = weight(abstract_txt:manual in 2532) [ClassicSimilarity], result of:
            0.08793948 = score(doc=2532,freq=3.0), product of:
              0.13651374 = queryWeight, product of:
                1.0854398 = boost
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.021135071 = queryNorm
              0.6441804 = fieldWeight in 2532, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.0625 = fieldNorm(doc=2532)
          0.026136624 = weight(abstract_txt:system in 2532) [ClassicSimilarity], result of:
            0.026136624 = score(doc=2532,freq=2.0), product of:
              0.08768538 = queryWeight, product of:
                1.2302579 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.021135071 = queryNorm
              0.2980728 = fieldWeight in 2532, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=2532)
          0.18774325 = weight(abstract_txt:classification in 2532) [ClassicSimilarity], result of:
            0.18774325 = score(doc=2532,freq=6.0), product of:
              0.30719206 = queryWeight, product of:
                3.6408901 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021135071 = queryNorm
              0.6111592 = fieldWeight in 2532, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2532)
          0.27034548 = weight(abstract_txt:automatic in 2532) [ClassicSimilarity], result of:
            0.27034548 = score(doc=2532,freq=4.0), product of:
              0.41626853 = queryWeight, product of:
                3.7908304 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.021135071 = queryNorm
              0.6494497 = fieldWeight in 2532, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=2532)
        0.2 = coord(5/25)
    
  3. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.11
    0.10885689 = sum of:
      0.10885689 = product of:
        0.68035555 = sum of:
          0.11029571 = weight(abstract_txt:match in 1054) [ClassicSimilarity], result of:
            0.11029571 = score(doc=1054,freq=2.0), product of:
              0.15662098 = queryWeight, product of:
                1.1626327 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.021135071 = queryNorm
              0.70422053 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.078125 = fieldNorm(doc=1054)
          0.1394913 = weight(abstract_txt:partial in 1054) [ClassicSimilarity], result of:
            0.1394913 = score(doc=1054,freq=2.0), product of:
              0.1831649 = queryWeight, product of:
                1.2572993 = boost
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.021135071 = queryNorm
              0.76156133 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.078125 = fieldNorm(doc=1054)
          0.19161466 = weight(abstract_txt:classification in 1054) [ClassicSimilarity], result of:
            0.19161466 = score(doc=1054,freq=4.0), product of:
              0.30719206 = queryWeight, product of:
                3.6408901 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021135071 = queryNorm
              0.6237618 = fieldWeight in 1054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=1054)
          0.23895389 = weight(abstract_txt:automatic in 1054) [ClassicSimilarity], result of:
            0.23895389 = score(doc=1054,freq=2.0), product of:
              0.41626853 = queryWeight, product of:
                3.7908304 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.021135071 = queryNorm
              0.57403785 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=1054)
        0.16 = coord(4/25)
    
  4. Adamson, G.W.; Boreham, J.: The use of an association measure based on character structure to identify semantically related pairs of words and document titles (1974) 0.10
    0.100701384 = sum of:
      0.100701384 = product of:
        0.6293837 = sum of:
          0.16921878 = weight(abstract_txt:coefficient in 398) [ClassicSimilarity], result of:
            0.16921878 = score(doc=398,freq=1.0), product of:
              0.2324515 = queryWeight, product of:
                1.4163929 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.021135071 = queryNorm
              0.72797453 = fieldWeight in 398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.09375 = fieldNorm(doc=398)
          0.14243701 = weight(abstract_txt:similarity in 398) [ClassicSimilarity], result of:
            0.14243701 = score(doc=398,freq=1.0), product of:
              0.261091 = queryWeight, product of:
                2.122895 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.021135071 = queryNorm
              0.54554546 = fieldWeight in 398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.09375 = fieldNorm(doc=398)
          0.1149688 = weight(abstract_txt:classification in 398) [ClassicSimilarity], result of:
            0.1149688 = score(doc=398,freq=1.0), product of:
              0.30719206 = queryWeight, product of:
                3.6408901 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021135071 = queryNorm
              0.37425706 = fieldWeight in 398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.09375 = fieldNorm(doc=398)
          0.2027591 = weight(abstract_txt:automatic in 398) [ClassicSimilarity], result of:
            0.2027591 = score(doc=398,freq=1.0), product of:
              0.41626853 = queryWeight, product of:
                3.7908304 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.021135071 = queryNorm
              0.48708728 = fieldWeight in 398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=398)
        0.16 = coord(4/25)
    
  5. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.10
    0.09957045 = sum of:
      0.09957045 = product of:
        0.49785227 = sum of:
          0.050916642 = weight(abstract_txt:novel in 5400) [ClassicSimilarity], result of:
            0.050916642 = score(doc=5400,freq=1.0), product of:
              0.11786748 = queryWeight, product of:
                1.0085897 = boost
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.021135071 = queryNorm
              0.4319821 = fieldWeight in 5400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.078125 = fieldNorm(doc=5400)
          0.06346485 = weight(abstract_txt:manual in 5400) [ClassicSimilarity], result of:
            0.06346485 = score(doc=5400,freq=1.0), product of:
              0.13651374 = queryWeight, product of:
                1.0854398 = boost
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.021135071 = queryNorm
              0.4648972 = fieldWeight in 5400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.078125 = fieldNorm(doc=5400)
          0.1186975 = weight(abstract_txt:similarity in 5400) [ClassicSimilarity], result of:
            0.1186975 = score(doc=5400,freq=1.0), product of:
              0.261091 = queryWeight, product of:
                2.122895 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.021135071 = queryNorm
              0.4546212 = fieldWeight in 5400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.078125 = fieldNorm(doc=5400)
          0.09580733 = weight(abstract_txt:classification in 5400) [ClassicSimilarity], result of:
            0.09580733 = score(doc=5400,freq=1.0), product of:
              0.30719206 = queryWeight, product of:
                3.6408901 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021135071 = queryNorm
              0.3118809 = fieldWeight in 5400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=5400)
          0.16896592 = weight(abstract_txt:automatic in 5400) [ClassicSimilarity], result of:
            0.16896592 = score(doc=5400,freq=1.0), product of:
              0.41626853 = queryWeight, product of:
                3.7908304 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.021135071 = queryNorm
              0.40590608 = fieldWeight in 5400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=5400)
        0.2 = coord(5/25)
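
The content scores above combine several matching terms. The sketch below recomputes the Larson (1992) entry, assuming each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and that the sum is scaled by coord = matched terms / query terms. The constants are copied from the listing; the function names are illustrative.

```python
import math

MAX_DOCS = 44218          # maxDocs from the listing
QUERY_NORM = 0.021135071  # queryNorm from the listing

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """idf formula that reproduces the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

def document_score(terms, field_norm: float, total_query_terms: int) -> float:
    """Sum of term scores, scaled by the coord factor."""
    matched = sum(term_score(b, df, f, field_norm) for b, df, f in terms)
    coord = len(terms) / total_query_terms
    return matched * coord

# (boost, docFreq, freq) for the four terms matched in doc 1054
larson_terms = [
    (1.1626327, 204, 2.0),   # match
    (1.2572993, 121, 2.0),   # partial
    (3.6408901, 2218, 4.0),  # classification
    (3.7908304, 665, 2.0),   # automatic
]
larson_score = document_score(larson_terms, 0.078125, 25)
print(round(larson_score, 4))
```

The coord factor explains why a document matching four of the 25 query terms keeps only 16% of its summed term scores.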