Document (#30441)

Author
Bloomfield, M.
Title
Indexing : neglected and poorly understood
Source
Cataloging and classification quarterly. 33(2001) no.1, pp.63-75
Year
2001
Abstract
The growth of the Internet has highlighted the use of machine indexing. The difficulties in using the Internet as a searching device can be frustrating. The use of the term "Python" is given as an example. Machine indexing is described as "rotten" and human indexing as "capricious." The problem seems to be the lack of a theoretical foundation for the art of indexing. What librarians have learned over the last hundred years has yet to yield a consistent approach to what really works best in preparing index terms and in the ability of our customers to search the various indexes. An attempt is made to consider the elements of indexing and their pros and cons. The argument is made that machine indexing is far too prolific in its production of index terms. Neither librarians nor computer programmers have made much progress in improving Internet indexing. Human indexing has had the same problems for over fifty years.
Footnote
See also: http://catalogingandclassificationquarterly.com/
Theme
Automatic indexing
Internet

Similar documents (content)

  1. Lancaster, F.W.: Trends in subject indexing from 1957 to 2000 (1980) 0.17
    0.16808856 = sum of:
      0.16808856 = product of:
        0.8404428 = sum of:
          0.052410755 = weight(abstract_txt:terms in 209) [ClassicSimilarity], result of:
            0.052410755 = score(doc=209,freq=2.0), product of:
              0.09737063 = queryWeight, product of:
                1.1167667 = boost
                4.059814 = idf(docFreq=1983, maxDocs=42306)
                0.021476297 = queryNorm
              0.5382604 = fieldWeight in 209, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.059814 = idf(docFreq=1983, maxDocs=42306)
                0.09375 = fieldNorm(doc=209)
          0.08314881 = weight(abstract_txt:index in 209) [ClassicSimilarity], result of:
            0.08314881 = score(doc=209,freq=2.0), product of:
              0.13244992 = queryWeight, product of:
                1.3024898 = boost
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.021476297 = queryNorm
              0.62777543 = fieldWeight in 209, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.09375 = fieldNorm(doc=209)
          0.0795178 = weight(abstract_txt:made in 209) [ClassicSimilarity], result of:
            0.0795178 = score(doc=209,freq=1.0), product of:
              0.18542333 = queryWeight, product of:
                1.8874536 = boost
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.021476297 = queryNorm
              0.42884463 = fieldWeight in 209, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.09375 = fieldNorm(doc=209)
          0.12695937 = weight(abstract_txt:machine in 209) [ClassicSimilarity], result of:
            0.12695937 = score(doc=209,freq=1.0), product of:
              0.25329775 = queryWeight, product of:
                2.2060215 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.021476297 = queryNorm
              0.5012258 = fieldWeight in 209, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.09375 = fieldNorm(doc=209)
          0.49840602 = weight(abstract_txt:indexing in 209) [ClassicSimilarity], result of:
            0.49840602 = score(doc=209,freq=6.0), product of:
              0.5003036 = queryWeight, product of:
                5.369966 = boost
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.021476297 = queryNorm
              0.9962071 = fieldWeight in 209, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.09375 = fieldNorm(doc=209)
        0.2 = coord(5/25)
    
  2. Lauser, B.; Johannsen, G.; Caracciolo, C.; Hage, W.R. van; Keizer, J.; Mayr, P.: Comparing human and automatic thesaurus mapping approaches in the agricultural domain (2008) 0.13
    0.1321564 = sum of:
      0.1321564 = product of:
        0.55065167 = sum of:
          0.12663151 = weight(abstract_txt:cons in 447) [ClassicSimilarity], result of:
            0.12663151 = score(doc=447,freq=1.0), product of:
              0.19798385 = queryWeight, product of:
                1.1260258 = boost
                8.186948 = idf(docFreq=31, maxDocs=42306)
                0.021476297 = queryNorm
              0.6396053 = fieldWeight in 447, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.186948 = idf(docFreq=31, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
          0.12811045 = weight(abstract_txt:pros in 447) [ClassicSimilarity], result of:
            0.12811045 = score(doc=447,freq=1.0), product of:
              0.19952238 = queryWeight, product of:
                1.1303924 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.021476297 = queryNorm
              0.6420857 = fieldWeight in 447, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
          0.0539844 = weight(abstract_txt:what in 447) [ClassicSimilarity], result of:
            0.0539844 = score(doc=447,freq=2.0), product of:
              0.11214521 = queryWeight, product of:
                1.1985022 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.021476297 = queryNorm
              0.48137945 = fieldWeight in 447, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
          0.06986096 = weight(abstract_txt:human in 447) [ClassicSimilarity], result of:
            0.06986096 = score(doc=447,freq=2.0), product of:
              0.13317569 = queryWeight, product of:
                1.3060534 = boost
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.021476297 = queryNorm
              0.52457744 = fieldWeight in 447, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
          0.06626483 = weight(abstract_txt:made in 447) [ClassicSimilarity], result of:
            0.06626483 = score(doc=447,freq=1.0), product of:
              0.18542333 = queryWeight, product of:
                1.8874536 = boost
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.021476297 = queryNorm
              0.35737053 = fieldWeight in 447, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
          0.105799474 = weight(abstract_txt:machine in 447) [ClassicSimilarity], result of:
            0.105799474 = score(doc=447,freq=1.0), product of:
              0.25329775 = queryWeight, product of:
                2.2060215 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.021476297 = queryNorm
              0.4176882 = fieldWeight in 447, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.078125 = fieldNorm(doc=447)
        0.24 = coord(6/25)
    
  3. Mooers, C.N.: ¬The indexing language of an information retrieval system (1985) 0.13
    0.1260202 = sum of:
      0.1260202 = product of:
        0.39381313 = sum of:
          0.059395537 = weight(abstract_txt:hundred in 4645) [ClassicSimilarity], result of:
            0.059395537 = score(doc=4645,freq=1.0), product of:
              0.1680107 = queryWeight, product of:
                1.0372941 = boost
                7.5418105 = idf(docFreq=60, maxDocs=42306)
                0.021476297 = queryNorm
              0.35352236 = fieldWeight in 4645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5418105 = idf(docFreq=60, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.026205378 = weight(abstract_txt:terms in 4645) [ClassicSimilarity], result of:
            0.026205378 = score(doc=4645,freq=2.0), product of:
              0.09737063 = queryWeight, product of:
                1.1167667 = boost
                4.059814 = idf(docFreq=1983, maxDocs=42306)
                0.021476297 = queryNorm
              0.2691302 = fieldWeight in 4645, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.059814 = idf(docFreq=1983, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.021539832 = weight(abstract_txt:over in 4645) [ClassicSimilarity], result of:
            0.021539832 = score(doc=4645,freq=1.0), product of:
              0.107647985 = queryWeight, product of:
                1.1742252 = boost
                4.268695 = idf(docFreq=1609, maxDocs=42306)
                0.021476297 = queryNorm
              0.20009507 = fieldWeight in 4645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.268695 = idf(docFreq=1609, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.022903642 = weight(abstract_txt:what in 4645) [ClassicSimilarity], result of:
            0.022903642 = score(doc=4645,freq=1.0), product of:
              0.11214521 = queryWeight, product of:
                1.1985022 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.021476297 = queryNorm
              0.204232 = fieldWeight in 4645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.029214418 = weight(abstract_txt:years in 4645) [ClassicSimilarity], result of:
            0.029214418 = score(doc=4645,freq=1.0), product of:
              0.13189931 = queryWeight, product of:
                1.2997797 = boost
                4.7251263 = idf(docFreq=1019, maxDocs=42306)
                0.021476297 = queryNorm
              0.2214903 = fieldWeight in 4645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7251263 = idf(docFreq=1019, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.050918035 = weight(abstract_txt:index in 4645) [ClassicSimilarity], result of:
            0.050918035 = score(doc=4645,freq=3.0), product of:
              0.13244992 = queryWeight, product of:
                1.3024898 = boost
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.021476297 = queryNorm
              0.38443235 = fieldWeight in 4645, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.0397589 = weight(abstract_txt:made in 4645) [ClassicSimilarity], result of:
            0.0397589 = score(doc=4645,freq=1.0), product of:
              0.18542333 = queryWeight, product of:
                1.8874536 = boost
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.021476297 = queryNorm
              0.21442232 = fieldWeight in 4645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
          0.14387742 = weight(abstract_txt:indexing in 4645) [ClassicSimilarity], result of:
            0.14387742 = score(doc=4645,freq=2.0), product of:
              0.5003036 = queryWeight, product of:
                5.369966 = boost
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.021476297 = queryNorm
              0.2875802 = fieldWeight in 4645, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.046875 = fieldNorm(doc=4645)
        0.32 = coord(8/25)
    
  4. Carroll, D.J.; Lele, P.: Human intervention in the networked environment : metadata alternatives (1998) 0.13
    0.12544757 = sum of:
      0.12544757 = product of:
        0.62723786 = sum of:
          0.15195782 = weight(abstract_txt:cons in 3222) [ClassicSimilarity], result of:
            0.15195782 = score(doc=3222,freq=1.0), product of:
              0.19798385 = queryWeight, product of:
                1.1260258 = boost
                8.186948 = idf(docFreq=31, maxDocs=42306)
                0.021476297 = queryNorm
              0.7675264 = fieldWeight in 3222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.186948 = idf(docFreq=31, maxDocs=42306)
                0.09375 = fieldNorm(doc=3222)
          0.15373255 = weight(abstract_txt:pros in 3222) [ClassicSimilarity], result of:
            0.15373255 = score(doc=3222,freq=1.0), product of:
              0.19952238 = queryWeight, product of:
                1.1303924 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.021476297 = queryNorm
              0.7705028 = fieldWeight in 3222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.09375 = fieldNorm(doc=3222)
          0.058795083 = weight(abstract_txt:index in 3222) [ClassicSimilarity], result of:
            0.058795083 = score(doc=3222,freq=1.0), product of:
              0.13244992 = queryWeight, product of:
                1.3024898 = boost
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.021476297 = queryNorm
              0.44390425 = fieldWeight in 3222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7349787 = idf(docFreq=1009, maxDocs=42306)
                0.09375 = fieldNorm(doc=3222)
          0.059278995 = weight(abstract_txt:human in 3222) [ClassicSimilarity], result of:
            0.059278995 = score(doc=3222,freq=1.0), product of:
              0.13317569 = queryWeight, product of:
                1.3060534 = boost
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.021476297 = queryNorm
              0.44511876 = fieldWeight in 3222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.09375 = fieldNorm(doc=3222)
          0.2034734 = weight(abstract_txt:indexing in 3222) [ClassicSimilarity], result of:
            0.2034734 = score(doc=3222,freq=1.0), product of:
              0.5003036 = queryWeight, product of:
                5.369966 = boost
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.021476297 = queryNorm
              0.40669984 = fieldWeight in 3222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.09375 = fieldNorm(doc=3222)
        0.2 = coord(5/25)
    
  5. Pong, J.Y.-H.; Kwok, R.C.-W.; Lau, R.Y.-K.; Hao, J.-X.; Wong, P.C.-C.: ¬A comparative study of two automatic document classification methods in a library setting (2008) 0.12
    0.11627327 = sum of:
      0.11627327 = product of:
        0.48447198 = sum of:
          0.038952556 = weight(abstract_txt:years in 352) [ClassicSimilarity], result of:
            0.038952556 = score(doc=352,freq=1.0), product of:
              0.13189931 = queryWeight, product of:
                1.2997797 = boost
                4.7251263 = idf(docFreq=1019, maxDocs=42306)
                0.021476297 = queryNorm
              0.2953204 = fieldWeight in 352, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7251263 = idf(docFreq=1019, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
          0.03951933 = weight(abstract_txt:human in 352) [ClassicSimilarity], result of:
            0.03951933 = score(doc=352,freq=1.0), product of:
              0.13317569 = queryWeight, product of:
                1.3060534 = boost
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.021476297 = queryNorm
              0.29674584 = fieldWeight in 352, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7479334 = idf(docFreq=996, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
          0.028079424 = weight(abstract_txt:internet in 352) [ClassicSimilarity], result of:
            0.028079424 = score(doc=352,freq=1.0), product of:
              0.12138763 = queryWeight, product of:
                1.5271486 = boost
                3.701125 = idf(docFreq=2839, maxDocs=42306)
                0.021476297 = queryNorm
              0.2313203 = fieldWeight in 352, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.701125 = idf(docFreq=2839, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
          0.053011864 = weight(abstract_txt:made in 352) [ClassicSimilarity], result of:
            0.053011864 = score(doc=352,freq=1.0), product of:
              0.18542333 = queryWeight, product of:
                1.8874536 = boost
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.021476297 = queryNorm
              0.28589642 = fieldWeight in 352, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5743427 = idf(docFreq=1185, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
          0.18925987 = weight(abstract_txt:machine in 352) [ClassicSimilarity], result of:
            0.18925987 = score(doc=352,freq=5.0), product of:
              0.25329775 = queryWeight, product of:
                2.2060215 = boost
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.021476297 = queryNorm
              0.7471834 = fieldWeight in 352, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.346409 = idf(docFreq=547, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
          0.13564894 = weight(abstract_txt:indexing in 352) [ClassicSimilarity], result of:
            0.13564894 = score(doc=352,freq=1.0), product of:
              0.5003036 = queryWeight, product of:
                5.369966 = boost
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.021476297 = queryNorm
              0.2711332 = fieldWeight in 352, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3381314 = idf(docFreq=1501, maxDocs=42306)
                0.0625 = fieldNorm(doc=352)
        0.24 = coord(6/25)
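
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal illustration, not the search engine's own code: the function names (classic_idf, classic_term_score, classic_doc_score) are hypothetical, and the idf form 1 + ln(maxDocs / (docFreq + 1)) is assumed because it reproduces the idf values listed above. The input numbers are copied from the first result (doc 209).

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces e.g. 4.059814 for docFreq=1983, maxDocs=42306.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each "weight(...)" subtree above.
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    query_weight = boost * idf * query_norm   # boost * idf * queryNorm
    field_weight = tf * idf * field_norm      # tf * idf * fieldNorm
    return query_weight * field_weight

def classic_doc_score(term_scores, total_query_terms):
    # Sum of the matching term scores, scaled by coord(matched/total).
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord

MAX_DOCS = 42306
QUERY_NORM = 0.021476297
FIELD_NORM_209 = 0.09375

# (term, freq, docFreq, boost) as listed for doc 209.
terms_209 = [
    ("terms",    2.0, 1983, 1.1167667),
    ("index",    2.0, 1009, 1.3024898),
    ("made",     1.0, 1185, 1.8874536),
    ("machine",  1.0,  547, 2.2060215),
    ("indexing", 6.0, 1501, 5.369966),
]

scores_209 = [
    classic_term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM_209)
    for _, freq, doc_freq, boost in terms_209
]
total_209 = classic_doc_score(scores_209, total_query_terms=25)

for (term, *_), score in zip(terms_209, scores_209):
    print(f"{term:10s} {score:.6f}")
print(f"document score: {total_209:.6f}")
```

The computed document score should agree with the 0.16808856 shown for the first result up to small rounding differences, since the listed values were produced with single-precision floats. The coord factor explains why a document matching more of the 25 query terms can outrank one with a larger raw sum.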