Document (#30368)

Author
Pappas, E.
Herendeen, A.
Title
Enhancing bibliographic records with tables of contents derived from OCR technologies at the American Museum of Natural History Library
Source
Cataloging and classification quarterly. 29(2000) no.4, pp. 61-72
Year
2000
Abstract
This paper reports on a project undertaken at the American Museum of Natural History Library in 1997 and intended to enhance access to materials in the library's collection by using scanning and OCR software to digitize and add monograph tables of contents to the OPAC bibliographic records. Initially, conference proceedings already in the collection were used, but, as the project developed, other types of materials were also used. The rationale for the project is explained, the procedure developed is described, and the lessons learned from using this particular technology are outlined.
Theme
Catalog enrichment

Similar documents (content)

  1. The National Digital Library (1994) 0.25
    0.24533775 = sum of:
      0.24533775 = product of:
        1.0222406 = sum of:
          0.15650691 = weight(abstract_txt:initially in 1764) [ClassicSimilarity], result of:
            0.15650691 = score(doc=1764,freq=1.0), product of:
              0.19788751 = queryWeight, product of:
                1.1021005 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.024831336 = queryNorm
              0.7908883 = fieldWeight in 1764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
          0.06115305 = weight(abstract_txt:developed in 1764) [ClassicSimilarity], result of:
            0.06115305 = score(doc=1764,freq=1.0), product of:
              0.13325538 = queryWeight, product of:
                1.2789966 = boost
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.024831336 = queryNorm
              0.4589162 = fieldWeight in 1764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
          0.30762008 = weight(abstract_txt:digitize in 1764) [ClassicSimilarity], result of:
            0.30762008 = score(doc=1764,freq=1.0), product of:
              0.31050777 = queryWeight, product of:
                1.3805376 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.024831336 = queryNorm
              0.9907001 = fieldWeight in 1764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
          0.09807573 = weight(abstract_txt:materials in 1764) [ClassicSimilarity], result of:
            0.09807573 = score(doc=1764,freq=1.0), product of:
              0.18257742 = queryWeight, product of:
                1.497099 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.024831336 = queryNorm
              0.5371734 = fieldWeight in 1764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
          0.19042416 = weight(abstract_txt:american in 1764) [ClassicSimilarity], result of:
            0.19042416 = score(doc=1764,freq=2.0), product of:
              0.22553332 = queryWeight, product of:
                1.6639197 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.024831336 = queryNorm
              0.8443283 = fieldWeight in 1764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
          0.20846075 = weight(abstract_txt:project in 1764) [ClassicSimilarity], result of:
            0.20846075 = score(doc=1764,freq=4.0), product of:
              0.21765365 = queryWeight, product of:
                2.001961 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.024831336 = queryNorm
              0.9577636 = fieldWeight in 1764, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.109375 = fieldNorm(doc=1764)
        0.24 = coord(6/25)
    
  2. DeVorsey, K.L.; Elson, C.; Gregorev, N.P.; Hansen, J.: The development of a local thesaurus to improve access to the anthropological collections of the American Museum of Natural History (2006) 0.21
    0.21199647 = sum of:
      0.21199647 = product of:
        0.52999115 = sum of:
          0.01569236 = weight(abstract_txt:used in 1174) [ClassicSimilarity], result of:
            0.01569236 = score(doc=1174,freq=1.0), product of:
              0.08541842 = queryWeight, product of:
                1.0240066 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.024831336 = queryNorm
              0.18371168 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.028938638 = weight(abstract_txt:were in 1174) [ClassicSimilarity], result of:
            0.028938638 = score(doc=1174,freq=2.0), product of:
              0.101953335 = queryWeight, product of:
                1.1187363 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.024831336 = queryNorm
              0.283842 = fieldWeight in 1174, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.031319465 = weight(abstract_txt:bibliographic in 1174) [ClassicSimilarity], result of:
            0.031319465 = score(doc=1174,freq=1.0), product of:
              0.13540527 = queryWeight, product of:
                1.2892727 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.024831336 = queryNorm
              0.23130167 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.035735745 = weight(abstract_txt:records in 1174) [ClassicSimilarity], result of:
            0.035735745 = score(doc=1174,freq=1.0), product of:
              0.14785224 = queryWeight, product of:
                1.3472276 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.024831336 = queryNorm
              0.24169904 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.041579667 = weight(abstract_txt:collection in 1174) [ClassicSimilarity], result of:
            0.041579667 = score(doc=1174,freq=1.0), product of:
              0.1635611 = queryWeight, product of:
                1.4169908 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.024831336 = queryNorm
              0.25421488 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.05109639 = weight(abstract_txt:history in 1174) [ClassicSimilarity], result of:
            0.05109639 = score(doc=1174,freq=1.0), product of:
              0.18765184 = queryWeight, product of:
                1.5177611 = boost
                4.9790826 = idf(docFreq=826, maxDocs=44218)
                0.024831336 = queryNorm
              0.27229357 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9790826 = idf(docFreq=826, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.05425018 = weight(abstract_txt:natural in 1174) [ClassicSimilarity], result of:
            0.05425018 = score(doc=1174,freq=1.0), product of:
              0.19529605 = queryWeight, product of:
                1.5483664 = boost
                5.0794845 = idf(docFreq=747, maxDocs=44218)
                0.024831336 = queryNorm
              0.27778432 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0794845 = idf(docFreq=747, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.06732511 = weight(abstract_txt:american in 1174) [ClassicSimilarity], result of:
            0.06732511 = score(doc=1174,freq=1.0), product of:
              0.22553332 = queryWeight, product of:
                1.6639197 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.024831336 = queryNorm
              0.29851514 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.15193836 = weight(abstract_txt:museum in 1174) [ClassicSimilarity], result of:
            0.15193836 = score(doc=1174,freq=2.0), product of:
              0.30798364 = queryWeight, product of:
                1.9444234 = boost
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.024831336 = queryNorm
              0.4933326 = fieldWeight in 1174, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
          0.052115187 = weight(abstract_txt:project in 1174) [ClassicSimilarity], result of:
            0.052115187 = score(doc=1174,freq=1.0), product of:
              0.21765365 = queryWeight, product of:
                2.001961 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.024831336 = queryNorm
              0.2394409 = fieldWeight in 1174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1174)
        0.4 = coord(10/25)
    
  3. Makinen, R.H.; Friesen, B.: Enhancing online bibliographic records to improve retrieval of reference collection monographs (1995) 0.19
    0.19021137 = sum of:
      0.19021137 = product of:
        0.95105684 = sum of:
          0.10210213 = weight(abstract_txt:records in 1700) [ClassicSimilarity], result of:
            0.10210213 = score(doc=1700,freq=1.0), product of:
              0.14785224 = queryWeight, product of:
                1.3472276 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.024831336 = queryNorm
              0.6905687 = fieldWeight in 1700, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.15625 = fieldNorm(doc=1700)
          0.11879905 = weight(abstract_txt:collection in 1700) [ClassicSimilarity], result of:
            0.11879905 = score(doc=1700,freq=1.0), product of:
              0.1635611 = queryWeight, product of:
                1.4169908 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.024831336 = queryNorm
              0.72632825 = fieldWeight in 1700, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.15625 = fieldNorm(doc=1700)
          0.22878122 = weight(abstract_txt:contents in 1700) [ClassicSimilarity], result of:
            0.22878122 = score(doc=1700,freq=1.0), product of:
              0.25317353 = queryWeight, product of:
                1.7629344 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.024831336 = queryNorm
              0.9036538 = fieldWeight in 1700, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.15625 = fieldNorm(doc=1700)
          0.14890052 = weight(abstract_txt:project in 1700) [ClassicSimilarity], result of:
            0.14890052 = score(doc=1700,freq=1.0), product of:
              0.21765365 = queryWeight, product of:
                2.001961 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.024831336 = queryNorm
              0.68411684 = fieldWeight in 1700, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.15625 = fieldNorm(doc=1700)
          0.35247394 = weight(abstract_txt:tables in 1700) [ClassicSimilarity], result of:
            0.35247394 = score(doc=1700,freq=1.0), product of:
              0.33771944 = queryWeight, product of:
                2.0361278 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.024831336 = queryNorm
              1.0436887 = fieldWeight in 1700, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.15625 = fieldNorm(doc=1700)
        0.2 = coord(5/25)
    
  4. Miksa, S.D.; Moen, W.E.; Snyder, G.; Polyakov, S.; Eklund, A.: Metadata assistance of the Functional Requirements for Bibliographic Records' four user tasks : a report on the MARC content designation utilization (MCDU) project (2006) 0.16
    0.15579826 = sum of:
      0.15579826 = product of:
        0.55642235 = sum of:
          0.025362683 = weight(abstract_txt:used in 125) [ClassicSimilarity], result of:
            0.025362683 = score(doc=125,freq=2.0), product of:
              0.08541842 = queryWeight, product of:
                1.0240066 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.024831336 = queryNorm
              0.2969229 = fieldWeight in 125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.019648813 = weight(abstract_txt:using in 125) [ClassicSimilarity], result of:
            0.019648813 = score(doc=125,freq=1.0), product of:
              0.09077974 = queryWeight, product of:
                1.0556537 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.024831336 = queryNorm
              0.21644491 = fieldWeight in 125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.10123979 = weight(abstract_txt:bibliographic in 125) [ClassicSimilarity], result of:
            0.10123979 = score(doc=125,freq=8.0), product of:
              0.13540527 = queryWeight, product of:
                1.2892727 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.024831336 = queryNorm
              0.7476798 = fieldWeight in 125, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.09132292 = weight(abstract_txt:records in 125) [ClassicSimilarity], result of:
            0.09132292 = score(doc=125,freq=5.0), product of:
              0.14785224 = queryWeight, product of:
                1.3472276 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.024831336 = queryNorm
              0.61766344 = fieldWeight in 125, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.07694298 = weight(abstract_txt:american in 125) [ClassicSimilarity], result of:
            0.07694298 = score(doc=125,freq=1.0), product of:
              0.22553332 = queryWeight, product of:
                1.6639197 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.024831336 = queryNorm
              0.34116015 = fieldWeight in 125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.12278474 = weight(abstract_txt:museum in 125) [ClassicSimilarity], result of:
            0.12278474 = score(doc=125,freq=1.0), product of:
              0.30798364 = queryWeight, product of:
                1.9444234 = boost
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.024831336 = queryNorm
              0.39867294 = fieldWeight in 125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
          0.11912043 = weight(abstract_txt:project in 125) [ClassicSimilarity], result of:
            0.11912043 = score(doc=125,freq=4.0), product of:
              0.21765365 = queryWeight, product of:
                2.001961 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.024831336 = queryNorm
              0.5472935 = fieldWeight in 125, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.0625 = fieldNorm(doc=125)
        0.28 = coord(7/25)
    
  5. LaBarre, K.A.; Tilley, C.L.: The elusive tale : leveraging the study of information seeking and knowledge organization to improve access to and discovery of folktales (2012) 0.15
    0.15189 = sum of:
      0.15189 = product of:
        0.47465622 = sum of:
          0.017934127 = weight(abstract_txt:used in 48) [ClassicSimilarity], result of:
            0.017934127 = score(doc=48,freq=1.0), product of:
              0.08541842 = queryWeight, product of:
                1.0240066 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.024831336 = queryNorm
              0.2099562 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.040505655 = weight(abstract_txt:were in 48) [ClassicSimilarity], result of:
            0.040505655 = score(doc=48,freq=3.0), product of:
              0.101953335 = queryWeight, product of:
                1.1187363 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.024831336 = queryNorm
              0.39729604 = fieldWeight in 48, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.035793673 = weight(abstract_txt:bibliographic in 48) [ClassicSimilarity], result of:
            0.035793673 = score(doc=48,freq=1.0), product of:
              0.13540527 = queryWeight, product of:
                1.2892727 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.024831336 = queryNorm
              0.26434475 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.040840853 = weight(abstract_txt:records in 48) [ClassicSimilarity], result of:
            0.040840853 = score(doc=48,freq=1.0), product of:
              0.14785224 = queryWeight, product of:
                1.3472276 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.024831336 = queryNorm
              0.27622747 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.04751962 = weight(abstract_txt:collection in 48) [ClassicSimilarity], result of:
            0.04751962 = score(doc=48,freq=1.0), product of:
              0.1635611 = queryWeight, product of:
                1.4169908 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.024831336 = queryNorm
              0.2905313 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.09151249 = weight(abstract_txt:contents in 48) [ClassicSimilarity], result of:
            0.09151249 = score(doc=48,freq=1.0), product of:
              0.25317353 = queryWeight, product of:
                1.7629344 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.024831336 = queryNorm
              0.36146152 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.059560213 = weight(abstract_txt:project in 48) [ClassicSimilarity], result of:
            0.059560213 = score(doc=48,freq=1.0), product of:
              0.21765365 = queryWeight, product of:
                2.001961 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.024831336 = queryNorm
              0.27364674 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
          0.14098959 = weight(abstract_txt:tables in 48) [ClassicSimilarity], result of:
            0.14098959 = score(doc=48,freq=1.0), product of:
              0.33771944 = queryWeight, product of:
                2.0361278 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.024831336 = queryNorm
              0.41747546 = fieldWeight in 48, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.0625 = fieldNorm(doc=48)
        0.32 = coord(8/25)
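
The score breakdowns above follow the usual TF-IDF pattern of Lucene's ClassicSimilarity: each term contributes queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The following is a minimal Python sketch that recomputes the total for result 3 (doc 1700) from the numbers listed there. The function names and the dictionary layout are illustrative assumptions, not part of the catalog software. The idf and tf formulas are the standard ClassicSimilarity definitions, which reproduce the listed idf values (for example 4.4196396 for docFreq=1446).

```python
import math

MAX_DOCS = 44218          # maxDocs, as listed in the explain output
QUERY_NORM = 0.024831336  # queryNorm, as listed in the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values copied from result 3 (doc 1700): term -> (freq, docFreq, boost)
TERMS_1700 = {
    "records": (1.0, 1446, 1.3472276),
    "collection": (1.0, 1150, 1.4169908),
    "contents": (1.0, 369, 1.7629344),
    "project": (1.0, 1507, 2.001961),
    "tables": (1.0, 150, 2.0361278),
}
FIELD_NORM_1700 = 0.15625
QUERY_TERM_COUNT = 25  # denominator of coord(5/25)

per_term = {
    term: term_score(freq, doc_freq, boost, FIELD_NORM_1700)
    for term, (freq, doc_freq, boost) in TERMS_1700.items()
}
coord = len(per_term) / QUERY_TERM_COUNT
total = sum(per_term.values()) * coord

for term, score in per_term.items():
    print(f"{term:12s} {score:.6f}")
print(f"coord = {coord:.2f}, total = {total:.6f}")
```

The recomputed total should agree with the listed 0.19021137 up to small rounding differences, since Lucene stores these factors as 32-bit floats while the sketch uses Python's 64-bit floats. The same functions can be applied to the other results by substituting their freq, docFreq, boost, fieldNorm, and coord values.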