Document (#18772)

Author
Prasad, A.R.D.
Title
Application of OCR in building bibliographic databases
Source
DESIDOC bulletin of information technology. 17(1997) no.4, pp.17-19
Year
1997
Abstract
Bibliographic databases tend to be very verbose and pose a problem for libraries because of the huge amount of data entry involved. Two technologies that offer solutions are retrospective conversion and OCR. Discusses the building of an intelligent system for the automatic identification of bibliographic elements such as title, author, publisher, etc. Considers the resolution of conflicts in situations where more than one bibliographic element satisfies the criteria specified for identification. This work is being carried out at the Indian Documentation Research and Training Centre, Bangalore, with the financial assistance of NISSAT (National Information System for Science and Technology).
Footnote
Contribution to the first in a series of special issues of this journal focusing on Indian bibliographic databases

Similar documents (author)

  1. Prasad, A.R.D.: PROMETHEUS: an automatic indexing system (1996) 7.99
    7.98839 = sum of:
      7.98839 = sum of:
        3.7635267 = weight(author_txt:prasad in 5189) [ClassicSimilarity], result of:
          3.7635267 = score(doc=5189,freq=1.0), product of:
            0.67936194 = queryWeight, product of:
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.076645635 = queryNorm
            5.5397964 = fieldWeight in 5189, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.625 = fieldNorm(doc=5189)
        4.2248635 = weight(author_txt:a.r.d in 5189) [ClassicSimilarity], result of:
          4.2248635 = score(doc=5189,freq=1.0), product of:
            0.7338033 = queryWeight, product of:
              1.0392959 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.076645635 = queryNorm
            5.7574883 = fieldWeight in 5189, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.625 = fieldNorm(doc=5189)
    
  2. Prasad, A.R.D.; Kar, B.B.: Parsing Boolean search expression using definite clause grammars (1994) 6.39
    6.3907123 = sum of:
      6.3907123 = sum of:
        3.0108213 = weight(author_txt:prasad in 8188) [ClassicSimilarity], result of:
          3.0108213 = score(doc=8188,freq=1.0), product of:
            0.67936194 = queryWeight, product of:
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.076645635 = queryNorm
            4.431837 = fieldWeight in 8188, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.5 = fieldNorm(doc=8188)
        3.379891 = weight(author_txt:a.r.d in 8188) [ClassicSimilarity], result of:
          3.379891 = score(doc=8188,freq=1.0), product of:
            0.7338033 = queryWeight, product of:
              1.0392959 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.076645635 = queryNorm
            4.6059904 = fieldWeight in 8188, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=8188)
    
  3. Karisiddappa, C.R.; Prasad, A.R.D.: Declarative programming and thesaurus construction (1993) 6.39
    6.3907123 = sum of:
      6.3907123 = sum of:
        3.0108213 = weight(author_txt:prasad in 3217) [ClassicSimilarity], result of:
          3.0108213 = score(doc=3217,freq=1.0), product of:
            0.67936194 = queryWeight, product of:
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.076645635 = queryNorm
            4.431837 = fieldWeight in 3217, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.5 = fieldNorm(doc=3217)
        3.379891 = weight(author_txt:a.r.d in 3217) [ClassicSimilarity], result of:
          3.379891 = score(doc=3217,freq=1.0), product of:
            0.7338033 = queryWeight, product of:
              1.0392959 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.076645635 = queryNorm
            4.6059904 = fieldWeight in 3217, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=3217)
    
  4. Mundgod, M.B.; Prasad, A.R.D.: Automatic identification of bibliographic data elements from the title pages of documents : a heuristic approach (1996) 6.39
    6.3907123 = sum of:
      6.3907123 = sum of:
        3.0108213 = weight(author_txt:prasad in 397) [ClassicSimilarity], result of:
          3.0108213 = score(doc=397,freq=1.0), product of:
            0.67936194 = queryWeight, product of:
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.076645635 = queryNorm
            4.431837 = fieldWeight in 397, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.5 = fieldNorm(doc=397)
        3.379891 = weight(author_txt:a.r.d in 397) [ClassicSimilarity], result of:
          3.379891 = score(doc=397,freq=1.0), product of:
            0.7338033 = queryWeight, product of:
              1.0392959 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.076645635 = queryNorm
            4.6059904 = fieldWeight in 397, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=397)
    
  5. Prasad, A.R.D.; Madalli, D.P.: Faceted infrastructure for semantic digital libraries (2008) 6.39
    6.3907123 = sum of:
      6.3907123 = sum of:
        3.0108213 = weight(author_txt:prasad in 1905) [ClassicSimilarity], result of:
          3.0108213 = score(doc=1905,freq=1.0), product of:
            0.67936194 = queryWeight, product of:
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.076645635 = queryNorm
            4.431837 = fieldWeight in 1905, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.863674 = idf(docFreq=16, maxDocs=44218)
              0.5 = fieldNorm(doc=1905)
        3.379891 = weight(author_txt:a.r.d in 1905) [ClassicSimilarity], result of:
          3.379891 = score(doc=1905,freq=1.0), product of:
            0.7338033 = queryWeight, product of:
              1.0392959 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.076645635 = queryNorm
            4.6059904 = fieldWeight in 1905, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=1905)
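
The author scores above can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: the function names are illustrative, and the idf formula is an assumption chosen because it reproduces the idf values printed in the listing. Each term score is queryWeight (boost * idf * queryNorm) multiplied by fieldWeight (tf * idf * fieldNorm), with tf taken as the square root of the term frequency, and the document score is the sum over matching terms.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf in the form that matches the listing: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, idf: float, query_norm: float,
               field_norm: float, boost: float = 1.0) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm          # query side
    field_weight = math.sqrt(freq) * idf * field_norm  # document side
    return query_weight * field_weight


# Values copied from entry 1 of the author list (doc=5189).
QUERY_NORM = 0.076645635
MAX_DOCS = 44218

prasad = term_score(1.0, classic_idf(16, MAX_DOCS), QUERY_NORM, 0.625)
ard = term_score(1.0, classic_idf(11, MAX_DOCS), QUERY_NORM, 0.625,
                 boost=1.0392959)
total = prasad + ard

print(f"prasad={prasad:.5f} a.r.d={ard:.5f} total={total:.5f}")
```

The results should agree with the listing to about four decimal places; small differences are expected because the listing appears to be computed in single precision. Entries 2 to 5 differ from entry 1 only in fieldNorm (0.5 instead of 0.625), which is presumably a length normalisation that favours the record with a single author.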
    

Similar documents (content)

  1. Hariharan, A.; Rao, B.R.K.; Somaiah, M.S.: Design and development of a database on micro-CDS/ISIS : union catalogue of the S&T conference proceedings (1991) 0.10
    0.10044898 = sum of:
      0.10044898 = product of:
        0.6278061 = sum of:
          0.08207368 = weight(abstract_txt:centre in 511) [ClassicSimilarity], result of:
            0.08207368 = score(doc=511,freq=1.0), product of:
              0.14054854 = queryWeight, product of:
                1.0534012 = boost
                6.228827 = idf(docFreq=236, maxDocs=44218)
                0.021420335 = queryNorm
              0.58395255 = fieldWeight in 511, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.228827 = idf(docFreq=236, maxDocs=44218)
                0.09375 = fieldNorm(doc=511)
          0.026049353 = weight(abstract_txt:system in 511) [ClassicSimilarity], result of:
            0.026049353 = score(doc=511,freq=1.0), product of:
              0.08239453 = queryWeight, product of:
                1.1406301 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.021420335 = queryNorm
              0.3161539 = fieldWeight in 511, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=511)
          0.16276175 = weight(abstract_txt:indian in 511) [ClassicSimilarity], result of:
            0.16276175 = score(doc=511,freq=1.0), product of:
              0.2218496 = queryWeight, product of:
                1.3234575 = boost
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.021420335 = queryNorm
              0.7336581 = fieldWeight in 511, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.09375 = fieldNorm(doc=511)
          0.35692134 = weight(abstract_txt:bangalore in 511) [ClassicSimilarity], result of:
            0.35692134 = score(doc=511,freq=2.0), product of:
              0.2972091 = queryWeight, product of:
                1.531834 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.021420335 = queryNorm
              1.2009099 = fieldWeight in 511, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.09375 = fieldNorm(doc=511)
        0.16 = coord(4/25)
    
  2. Mundgod, M.B.; Prasad, A.R.D.: Automatic identification of bibliographic data elements from the title pages of documents : a heuristic approach (1996) 0.10
    0.09962604 = sum of:
      0.09962604 = product of:
        0.4981302 = sum of:
          0.070213795 = weight(abstract_txt:entry in 397) [ClassicSimilarity], result of:
            0.070213795 = score(doc=397,freq=1.0), product of:
              0.12665978 = queryWeight, product of:
                5.913062 = idf(docFreq=324, maxDocs=44218)
                0.021420335 = queryNorm
              0.55434954 = fieldWeight in 397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.913062 = idf(docFreq=324, maxDocs=44218)
                0.09375 = fieldNorm(doc=397)
          0.045118805 = weight(abstract_txt:system in 397) [ClassicSimilarity], result of:
            0.045118805 = score(doc=397,freq=3.0), product of:
              0.08239453 = queryWeight, product of:
                1.1406301 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.021420335 = queryNorm
              0.54759467 = fieldWeight in 397, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=397)
          0.13028185 = weight(abstract_txt:publisher in 397) [ClassicSimilarity], result of:
            0.13028185 = score(doc=397,freq=1.0), product of:
              0.19125511 = queryWeight, product of:
                1.2288169 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.021420335 = queryNorm
              0.68119407 = fieldWeight in 397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.09375 = fieldNorm(doc=397)
          0.10716005 = weight(abstract_txt:building in 397) [ClassicSimilarity], result of:
            0.10716005 = score(doc=397,freq=1.0), product of:
              0.21153831 = queryWeight, product of:
                1.8276379 = boost
                5.403468 = idf(docFreq=540, maxDocs=44218)
                0.021420335 = queryNorm
              0.5065751 = fieldWeight in 397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.403468 = idf(docFreq=540, maxDocs=44218)
                0.09375 = fieldNorm(doc=397)
          0.14535572 = weight(abstract_txt:bibliographic in 397) [ClassicSimilarity], result of:
            0.14535572 = score(doc=397,freq=2.0), product of:
              0.25921205 = queryWeight, product of:
                2.8611343 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.021420335 = queryNorm
              0.5607599 = fieldWeight in 397, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.09375 = fieldNorm(doc=397)
        0.2 = coord(5/25)
    
  3. Jakac-Bizjak, V.: Planning the national electronic library in Slovenia (1998) 0.10
    0.09955212 = sum of:
      0.09955212 = product of:
        0.4148005 = sum of:
          0.06839473 = weight(abstract_txt:centre in 5194) [ClassicSimilarity], result of:
            0.06839473 = score(doc=5194,freq=1.0), product of:
              0.14054854 = queryWeight, product of:
                1.0534012 = boost
                6.228827 = idf(docFreq=236, maxDocs=44218)
                0.021420335 = queryNorm
              0.4866271 = fieldWeight in 5194, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.228827 = idf(docFreq=236, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
          0.069242895 = weight(abstract_txt:conversion in 5194) [ClassicSimilarity], result of:
            0.069242895 = score(doc=5194,freq=1.0), product of:
              0.14170812 = queryWeight, product of:
                1.0577378 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.021420335 = queryNorm
              0.4886304 = fieldWeight in 5194, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
          0.03069946 = weight(abstract_txt:system in 5194) [ClassicSimilarity], result of:
            0.03069946 = score(doc=5194,freq=2.0), product of:
              0.08239453 = queryWeight, product of:
                1.1406301 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.021420335 = queryNorm
              0.372591 = fieldWeight in 5194, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
          0.09202931 = weight(abstract_txt:retrospective in 5194) [ClassicSimilarity], result of:
            0.09202931 = score(doc=5194,freq=1.0), product of:
              0.17130184 = queryWeight, product of:
                1.1629517 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.021420335 = queryNorm
              0.5372348 = fieldWeight in 5194, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
          0.068782434 = weight(abstract_txt:databases in 5194) [ClassicSimilarity], result of:
            0.068782434 = score(doc=5194,freq=2.0), product of:
              0.14107919 = queryWeight, product of:
                1.4925439 = boost
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.021420335 = queryNorm
              0.48754486 = fieldWeight in 5194, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
          0.08565167 = weight(abstract_txt:bibliographic in 5194) [ClassicSimilarity], result of:
            0.08565167 = score(doc=5194,freq=1.0), product of:
              0.25921205 = queryWeight, product of:
                2.8611343 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.021420335 = queryNorm
              0.33043092 = fieldWeight in 5194, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.078125 = fieldNorm(doc=5194)
        0.24 = coord(6/25)
    
  4. McGarry, D.: Displays of bibliographic records in call number order : functions of the displays and data elements needed (1992) 0.09
    0.087853305 = sum of:
      0.087853305 = product of:
        0.4392665 = sum of:
          0.0661982 = weight(abstract_txt:entry in 2384) [ClassicSimilarity], result of:
            0.0661982 = score(doc=2384,freq=2.0), product of:
              0.12665978 = queryWeight, product of:
                5.913062 = idf(docFreq=324, maxDocs=44218)
                0.021420335 = queryNorm
              0.5226458 = fieldWeight in 2384, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.913062 = idf(docFreq=324, maxDocs=44218)
                0.0625 = fieldNorm(doc=2384)
          0.058763016 = weight(abstract_txt:element in 2384) [ClassicSimilarity], result of:
            0.058763016 = score(doc=2384,freq=1.0), product of:
              0.14739655 = queryWeight, product of:
                1.0787587 = boost
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.021420335 = queryNorm
              0.39867294 = fieldWeight in 2384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.378767 = idf(docFreq=203, maxDocs=44218)
                0.0625 = fieldNorm(doc=2384)
          0.08685457 = weight(abstract_txt:publisher in 2384) [ClassicSimilarity], result of:
            0.08685457 = score(doc=2384,freq=1.0), product of:
              0.19125511 = queryWeight, product of:
                1.2288169 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.021420335 = queryNorm
              0.4541294 = fieldWeight in 2384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=2384)
          0.09040805 = weight(abstract_txt:identification in 2384) [ClassicSimilarity], result of:
            0.09040805 = score(doc=2384,freq=1.0), product of:
              0.24749476 = queryWeight, product of:
                1.9768724 = boost
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.021420335 = queryNorm
              0.3652928 = fieldWeight in 2384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.0625 = fieldNorm(doc=2384)
          0.13704269 = weight(abstract_txt:bibliographic in 2384) [ClassicSimilarity], result of:
            0.13704269 = score(doc=2384,freq=4.0), product of:
              0.25921205 = queryWeight, product of:
                2.8611343 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.021420335 = queryNorm
              0.5286895 = fieldWeight in 2384, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.0625 = fieldNorm(doc=2384)
        0.2 = coord(5/25)
    
  5. VanAvery, A.R.: Recat vs. Recon of serials : a problem for shared cataloging (1990) 0.09
    0.085422136 = sum of:
      0.085422136 = product of:
        0.53388834 = sum of:
          0.110788636 = weight(abstract_txt:conversion in 473) [ClassicSimilarity], result of:
            0.110788636 = score(doc=473,freq=1.0), product of:
              0.14170812 = queryWeight, product of:
                1.0577378 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.021420335 = queryNorm
              0.7818087 = fieldWeight in 473, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.125 = fieldNorm(doc=473)
          0.20823857 = weight(abstract_txt:retrospective in 473) [ClassicSimilarity], result of:
            0.20823857 = score(doc=473,freq=2.0), product of:
              0.17130184 = queryWeight, product of:
                1.1629517 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.021420335 = queryNorm
              1.2156236 = fieldWeight in 473, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.125 = fieldNorm(doc=473)
          0.077818446 = weight(abstract_txt:databases in 473) [ClassicSimilarity], result of:
            0.077818446 = score(doc=473,freq=1.0), product of:
              0.14107919 = queryWeight, product of:
                1.4925439 = boost
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.021420335 = queryNorm
              0.5515941 = fieldWeight in 473, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.125 = fieldNorm(doc=473)
          0.13704269 = weight(abstract_txt:bibliographic in 473) [ClassicSimilarity], result of:
            0.13704269 = score(doc=473,freq=1.0), product of:
              0.25921205 = queryWeight, product of:
                2.8611343 = boost
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.021420335 = queryNorm
              0.5286895 = fieldWeight in 473, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.229516 = idf(docFreq=1749, maxDocs=44218)
                0.125 = fieldNorm(doc=473)
        0.16 = coord(4/25)
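
The content scores add one more step to the per-term arithmetic: the sum of the matching term scores is multiplied by a coordination factor, coord(matched/total), where the listing shows 25 query terms in total. The sketch below recomputes entry 1 of the content list (doc=511) under that reading; the function names are illustrative and not taken from the search engine.

```python
import math


def term_score(freq: float, idf: float, query_norm: float,
               field_norm: float, boost: float = 1.0) -> float:
    """Score of one matching term: (boost * idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)."""
    return (boost * idf * query_norm) * (math.sqrt(freq) * idf * field_norm)


def coordinated_score(term_scores: list[float], total_query_terms: int) -> float:
    """Sum of the matching term scores, scaled by the share of query terms matched."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


# "bangalore" in doc=511, with values copied from the listing (freq=2.0).
bangalore = term_score(2.0, 9.05783, 0.021420335, 0.09375, boost=1.531834)

# Per-term scores of doc=511 as printed: centre, system, indian, bangalore.
doc_511_terms = [0.08207368, 0.026049353, 0.16276175, 0.35692134]
doc_511_score = coordinated_score(doc_511_terms, 25)

print(f"bangalore={bangalore:.6f} doc_511={doc_511_score:.6f}")
```

The coordination factor explains why entry 3 (6 of 25 terms matched) scores almost as high as entry 1 despite a smaller raw sum: 0.4148005 * 0.24 against 0.6278061 * 0.16.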