Document (#11258)

Author
Cheng, P.T.K.
Wu, A.K.W.
Title
ACS: an automatic classification system
Source
Journal of information science. 21(1995) no.4, pp.289-299
Year
1995
Abstract
In this paper, we introduce ACS, an automatic classification system for school libraries. First, various approaches to automatic classification, namely (i) rule-based, (ii) browse and search, and (iii) partial match, are critically reviewed. The central issues of scheme selection, text analysis and similarity measures are discussed. A novel approach to detecting book-class similarity with the Modified Overlap Coefficient (MOC) is also proposed. Finally, the design and implementation of ACS are presented. The test result of over 80% correctness in automatic classification and a cost reduction of 75% compared to manual classification suggest that ACS is highly adoptable.
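The abstract names the Modified Overlap Coefficient but does not define it. As background only, the sketch below computes the standard overlap coefficient, |A ∩ B| / min(|A|, |B|), between a book's term set and a class's term set. It is not the authors' MOC, whose modification is not described here; the function name and the sample term sets are hypothetical.

```python
def overlap_coefficient(a, b):
    """Standard overlap coefficient of two term collections.

    This is the unmodified textbook measure, NOT the MOC from the paper.
    """
    a, b = set(a), set(b)
    if not a or not b:
        # Define the similarity of an empty set with anything as 0.
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical term sets, for illustration only.
book_terms = {"library", "classification", "automatic", "school"}
class_terms = {"classification", "library", "catalogue"}

similarity = overlap_coefficient(book_terms, class_terms)
print(similarity)  # 2 shared terms / smaller set of size 3
```

Dividing by the smaller set means a short class description fully contained in a book's terms scores 1.0, which is one reason overlap-style measures are used when the two sides differ greatly in length.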
Theme
Automatic classification

Similar documents (author)

  1. Cheng, L.R.L.: Beyond bilingualism : a quest for communicative competence (1996) 5.36
    5.364876 = sum of:
      5.364876 = weight(author_txt:cheng in 6292) [ClassicSimilarity], result of:
        5.364876 = score(doc=6292,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.116498485 = queryNorm
          5.3648763 = fieldWeight in 6292, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.625 = fieldNorm(doc=6292)
    
  2. Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 4.29
    4.2919006 = sum of:
      4.2919006 = weight(author_txt:cheng in 3812) [ClassicSimilarity], result of:
        4.2919006 = score(doc=3812,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.116498485 = queryNorm
          4.291901 = fieldWeight in 3812, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.5 = fieldNorm(doc=3812)
    
  3. Cheng, L.-y.: On bibliographic(al) control (1998) 4.29
    4.2919006 = sum of:
      4.2919006 = weight(author_txt:cheng in 5377) [ClassicSimilarity], result of:
        4.2919006 = score(doc=5377,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.116498485 = queryNorm
          4.291901 = fieldWeight in 5377, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.5 = fieldNorm(doc=5377)
    
  4. Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 3.76
    3.7554133 = sum of:
      3.7554133 = weight(author_txt:cheng in 5285) [ClassicSimilarity], result of:
        3.7554133 = score(doc=5285,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.116498485 = queryNorm
          3.7554135 = fieldWeight in 5285, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.4375 = fieldNorm(doc=5285)
    
  5. Cheng, W.-N.; Khoo, C.S.G.: Information and argument structures in Sociology research abstracts (2018) 3.76
    3.7554133 = sum of:
      3.7554133 = weight(author_txt:cheng in 751) [ClassicSimilarity], result of:
        3.7554133 = score(doc=751,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.116498485 = queryNorm
          3.7554135 = fieldWeight in 751, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.583802 = idf(docFreq=21, maxDocs=43254)
            0.4375 = fieldNorm(doc=751)
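The author scores above can be reproduced by hand. The sketch below assumes the classic TF-IDF formulas that the printed numbers are consistent with: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are hypothetical; the input values are copied from entry 1 (doc=6292).

```python
import math


def classic_idf(doc_freq, max_docs):
    # Consistent with the listing: idf(docFreq=21, maxDocs=43254) = 8.583802
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Consistent with the listing: tf(freq=1.0) = 1.0
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # about 1.0 here
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight


# Values taken from entry 1 (author_txt:cheng in 6292).
score = term_score(
    freq=1.0,
    doc_freq=21,
    max_docs=43254,
    query_norm=0.116498485,
    field_norm=0.625,
)
print(round(score, 2))  # 5.36, the score shown next to entry 1
```

Because the query term, idf, and queryNorm are identical for all five entries, the ranking here is driven entirely by fieldNorm (0.625, 0.5, 0.4375), which is larger for shorter author fields.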
    

Similar documents (content)

  1. Gowtham, M.S.; Kamat, S.K.: An expert system as a tool to classification (1995) 0.13
    0.12957075 = sum of:
      0.12957075 = product of:
        0.53987813 = sum of:
          0.08761953 = weight(abstract_txt:class in 4804) [ClassicSimilarity], result of:
            0.08761953 = score(doc=4804,freq=2.0), product of:
              0.13419423 = queryWeight, product of:
                1.076707 = boost
                5.9096537 = idf(docFreq=318, maxDocs=43254)
                0.021089887 = queryNorm
              0.6529307 = fieldWeight in 4804, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9096537 = idf(docFreq=318, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
          0.063483566 = weight(abstract_txt:manual in 4804) [ClassicSimilarity], result of:
            0.063483566 = score(doc=4804,freq=1.0), product of:
              0.13639049 = queryWeight, product of:
                1.0854821 = boost
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.021089887 = queryNorm
              0.46545446 = fieldWeight in 4804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
          0.08703584 = weight(abstract_txt:rule in 4804) [ClassicSimilarity], result of:
            0.08703584 = score(doc=4804,freq=1.0), product of:
              0.16832243 = queryWeight, product of:
                1.2058731 = boost
                6.6185994 = idf(docFreq=156, maxDocs=43254)
                0.021089887 = queryNorm
              0.5170781 = fieldWeight in 4804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6185994 = idf(docFreq=156, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
          0.0396114 = weight(abstract_txt:system in 4804) [ClassicSimilarity], result of:
            0.0396114 = score(doc=4804,freq=3.0), product of:
              0.087001406 = queryWeight, product of:
                1.2260517 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.021089887 = queryNorm
              0.45529607 = fieldWeight in 4804, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
          0.09600512 = weight(abstract_txt:modified in 4804) [ClassicSimilarity], result of:
            0.09600512 = score(doc=4804,freq=1.0), product of:
              0.17969646 = queryWeight, product of:
                1.2459494 = boost
                6.838563 = idf(docFreq=125, maxDocs=43254)
                0.021089887 = queryNorm
              0.5342627 = fieldWeight in 4804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.838563 = idf(docFreq=125, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
          0.16612265 = weight(abstract_txt:classification in 4804) [ClassicSimilarity], result of:
            0.16612265 = score(doc=4804,freq=3.0), product of:
              0.30707565 = queryWeight, product of:
                3.6419866 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021089887 = queryNorm
              0.5409828 = fieldWeight in 4804, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=4804)
        0.24 = coord(6/25)
    
  2. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.11
    0.10865713 = sum of:
      0.10865713 = product of:
        0.67910707 = sum of:
          0.10930043 = weight(abstract_txt:match in 1054) [ClassicSimilarity], result of:
            0.10930043 = score(doc=1054,freq=2.0), product of:
              0.15550624 = queryWeight, product of:
                1.1590563 = boost
                6.361639 = idf(docFreq=202, maxDocs=43254)
                0.021089887 = queryNorm
              0.70286846 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.361639 = idf(docFreq=202, maxDocs=43254)
                0.078125 = fieldNorm(doc=1054)
          0.13971643 = weight(abstract_txt:partial in 1054) [ClassicSimilarity], result of:
            0.13971643 = score(doc=1054,freq=2.0), product of:
              0.1831604 = queryWeight, product of:
                1.2579008 = boost
                6.9041605 = idf(docFreq=117, maxDocs=43254)
                0.021089887 = queryNorm
              0.76280916 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9041605 = idf(docFreq=117, maxDocs=43254)
                0.078125 = fieldNorm(doc=1054)
          0.1918219 = weight(abstract_txt:classification in 1054) [ClassicSimilarity], result of:
            0.1918219 = score(doc=1054,freq=4.0), product of:
              0.30707565 = queryWeight, product of:
                3.6419866 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021089887 = queryNorm
              0.6246731 = fieldWeight in 1054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=1054)
          0.23826832 = weight(abstract_txt:automatic in 1054) [ClassicSimilarity], result of:
            0.23826832 = score(doc=1054,freq=2.0), product of:
              0.4150153 = queryWeight, product of:
                3.7869773 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.021089887 = queryNorm
              0.5741193 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.078125 = fieldNorm(doc=1054)
        0.16 = coord(4/25)
    
  3. Adamson, G.W.; Boreham, J.: The use of an association measure based on character structure to identify semantically related pairs of words and document titles (1974) 0.10
    0.100208595 = sum of:
      0.100208595 = product of:
        0.62630373 = sum of:
          0.1672292 = weight(abstract_txt:coefficient in 2399) [ClassicSimilarity], result of:
            0.1672292 = score(doc=2399,freq=1.0), product of:
              0.23037243 = queryWeight, product of:
                1.4107364 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.021089887 = queryNorm
              0.7259081 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.09375 = fieldNorm(doc=2399)
          0.141804 = weight(abstract_txt:similarity in 2399) [ClassicSimilarity], result of:
            0.141804 = score(doc=2399,freq=1.0), product of:
              0.26003075 = queryWeight, product of:
                2.1196198 = boost
                5.8169117 = idf(docFreq=349, maxDocs=43254)
                0.021089887 = queryNorm
              0.5453355 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8169117 = idf(docFreq=349, maxDocs=43254)
                0.09375 = fieldNorm(doc=2399)
          0.11509314 = weight(abstract_txt:classification in 2399) [ClassicSimilarity], result of:
            0.11509314 = score(doc=2399,freq=1.0), product of:
              0.30707565 = queryWeight, product of:
                3.6419866 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021089887 = queryNorm
              0.37480387 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.09375 = fieldNorm(doc=2399)
          0.20217739 = weight(abstract_txt:automatic in 2399) [ClassicSimilarity], result of:
            0.20217739 = score(doc=2399,freq=1.0), product of:
              0.4150153 = queryWeight, product of:
                3.7869773 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.021089887 = queryNorm
              0.48715645 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.09375 = fieldNorm(doc=2399)
        0.16 = coord(4/25)
    
  4. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.10
    0.099535 = sum of:
      0.099535 = product of:
        0.497675 = sum of:
          0.05162935 = weight(abstract_txt:novel in 401) [ClassicSimilarity], result of:
            0.05162935 = score(doc=401,freq=1.0), product of:
              0.11883408 = queryWeight, product of:
                1.0132139 = boost
                5.561163 = idf(docFreq=451, maxDocs=43254)
                0.021089887 = queryNorm
              0.43446586 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.561163 = idf(docFreq=451, maxDocs=43254)
                0.078125 = fieldNorm(doc=401)
          0.063483566 = weight(abstract_txt:manual in 401) [ClassicSimilarity], result of:
            0.063483566 = score(doc=401,freq=1.0), product of:
              0.13639049 = queryWeight, product of:
                1.0854821 = boost
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.021089887 = queryNorm
              0.46545446 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.078125 = fieldNorm(doc=401)
          0.11816999 = weight(abstract_txt:similarity in 401) [ClassicSimilarity], result of:
            0.11816999 = score(doc=401,freq=1.0), product of:
              0.26003075 = queryWeight, product of:
                2.1196198 = boost
                5.8169117 = idf(docFreq=349, maxDocs=43254)
                0.021089887 = queryNorm
              0.45444623 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8169117 = idf(docFreq=349, maxDocs=43254)
                0.078125 = fieldNorm(doc=401)
          0.09591095 = weight(abstract_txt:classification in 401) [ClassicSimilarity], result of:
            0.09591095 = score(doc=401,freq=1.0), product of:
              0.30707565 = queryWeight, product of:
                3.6419866 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021089887 = queryNorm
              0.31233656 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=401)
          0.16848114 = weight(abstract_txt:automatic in 401) [ClassicSimilarity], result of:
            0.16848114 = score(doc=401,freq=1.0), product of:
              0.4150153 = queryWeight, product of:
                3.7869773 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.021089887 = queryNorm
              0.4059637 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.078125 = fieldNorm(doc=401)
        0.2 = coord(5/25)
    
  5. Xu, L.; Qiu, J.: Unsupervised multi-class sentiment classification approach (2019) 0.10
    0.09917953 = sum of:
      0.09917953 = product of:
        0.49589765 = sum of:
          0.04130348 = weight(abstract_txt:novel in 4) [ClassicSimilarity], result of:
            0.04130348 = score(doc=4,freq=1.0), product of:
              0.11883408 = queryWeight, product of:
                1.0132139 = boost
                5.561163 = idf(docFreq=451, maxDocs=43254)
                0.021089887 = queryNorm
              0.34757268 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.561163 = idf(docFreq=451, maxDocs=43254)
                0.0625 = fieldNorm(doc=4)
          0.11083091 = weight(abstract_txt:class in 4) [ClassicSimilarity], result of:
            0.11083091 = score(doc=4,freq=5.0), product of:
              0.13419423 = queryWeight, product of:
                1.076707 = boost
                5.9096537 = idf(docFreq=318, maxDocs=43254)
                0.021089887 = queryNorm
              0.82589924 = fieldWeight in 4, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.9096537 = idf(docFreq=318, maxDocs=43254)
                0.0625 = fieldNorm(doc=4)
          0.05078685 = weight(abstract_txt:manual in 4) [ClassicSimilarity], result of:
            0.05078685 = score(doc=4,freq=1.0), product of:
              0.13639049 = queryWeight, product of:
                1.0854821 = boost
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.021089887 = queryNorm
              0.37236357 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.957817 = idf(docFreq=303, maxDocs=43254)
                0.0625 = fieldNorm(doc=4)
          0.089971185 = weight(abstract_txt:reduction in 4) [ClassicSimilarity], result of:
            0.089971185 = score(doc=4,freq=1.0), product of:
              0.1996881 = queryWeight, product of:
                1.3134295 = boost
                7.2089367 = idf(docFreq=86, maxDocs=43254)
                0.021089887 = queryNorm
              0.45055854 = fieldWeight in 4, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2089367 = idf(docFreq=86, maxDocs=43254)
                0.0625 = fieldNorm(doc=4)
          0.20300521 = weight(abstract_txt:classification in 4) [ClassicSimilarity], result of:
            0.20300521 = score(doc=4,freq=7.0), product of:
              0.30707565 = queryWeight, product of:
                3.6419866 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021089887 = queryNorm
              0.66109186 = fieldWeight in 4, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=4)
        0.2 = coord(5/25)
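The content scores combine several per-term scores and then scale the sum by a coordination factor, coord(matched/total). The sketch below recomputes entry 2 (Larson, doc=1054) under the same assumed formulas that the listing is consistent with: idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The helper names are hypothetical; boosts, document frequencies, term frequencies, queryNorm, and fieldNorm are copied from the listing.

```python
import math

QUERY_NORM = 0.021089887  # queryNorm shown for every content term
MAX_DOCS = 43254
QUERY_TERMS = 25          # denominator of coord(4/25)


def idf(doc_freq, max_docs=MAX_DOCS):
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (boost, docFreq, termFreq) for the four terms matched in doc 1054.
terms = [
    (1.1590563, 202, 2.0),   # match
    (1.2579008, 117, 2.0),   # partial
    (3.6419866, 2157, 4.0),  # classification
    (3.7869773, 650, 2.0),   # automatic
]

matched_sum = sum(term_score(b, df, f, 0.078125) for b, df, f in terms)
coord = len(terms) / QUERY_TERMS
total = matched_sum * coord
print(round(total, 4))  # 0.1087, i.e. the listed 0.10865713 up to rounding
```

The coord factor explains why entry 1 outranks entry 2 despite a smaller per-term sum (0.5399 versus 0.6791): it matches 6 of the 25 query terms rather than 4.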