Document (#44186)

Author
Zakaria, M.S.
Title
Measuring typographical errors in online catalogs of academic libraries using Ballard's list : a case study from Egypt
Source
Cataloging and classification quarterly. 61(2023) no.7-8, pp.848-870
Year
2023
Abstract
Typographical errors in bibliographic records of online library catalogs are a common and troublesome phenomenon worldwide. They can affect the retrieval and identification of items in information retrieval systems and thus prevent users from finding the documents they need. The present study was conducted to measure typographical errors in the online catalog of the Egyptian Universities Libraries Consortium (EULC). The investigation relied on Terry Ballard's list of typographical error terms. The EULC catalog was searched to identify matching erroneous records. The study found that the total number of erroneous records reached 1686, with a mean error rate per record of 11.24, which is very high. About 396 erroneous records (23.49%) were retrieved from Section C of Ballard's list (Moderate Probability). Typographical errors found within the abstracts of the study's sample records accounted for 35.82%. Omissions were the most common type of error at 54.51%, followed by transpositions at 17.08%. Regarding the analysis of parts of speech, the study found that 63.46% of errors occur in noun terms. The results of the study indicate that typographical errors still pose a serious challenge for information retrieval systems, especially for library systems in the Arab environment. The study proposes some solutions for Egyptian university libraries in order to avoid typographical mistakes in the future.
Content
Cf.: https://www.tandfonline.com/doi/full/10.1080/01639374.2023.2282579.
Theme
Descriptive cataloging
Location
Egypt
Aid
Ballard's list

Similar documents (content)

  1. Beall, J.; Kafadar, K.: The effectiveness of copy cataloging at eliminating typographical errors in shared bibliographic records (2004) 0.78
    0.77934647 = sum of:
      0.77934647 = product of:
        1.9483662 = sum of:
          0.016442155 = weight(abstract_txt:retrieval in 4849) [ClassicSimilarity], result of:
            0.016442155 = score(doc=4849,freq=1.0), product of:
              0.050467905 = queryWeight, product of:
                1.2890408 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.011266172 = queryNorm
              0.3257943 = fieldWeight in 4849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.01930517 = weight(abstract_txt:online in 4849) [ClassicSimilarity], result of:
            0.01930517 = score(doc=4849,freq=1.0), product of:
              0.05616837 = queryWeight, product of:
                1.3598937 = boost
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.011266172 = queryNorm
              0.34370178 = fieldWeight in 4849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.02998751 = weight(abstract_txt:libraries in 4849) [ClassicSimilarity], result of:
            0.02998751 = score(doc=4849,freq=2.0), product of:
              0.059794288 = queryWeight, product of:
                1.4031008 = boost
                3.782635 = idf(docFreq=2735, maxDocs=44218)
                0.011266172 = queryNorm
              0.5015113 = fieldWeight in 4849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.782635 = idf(docFreq=2735, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.08243342 = weight(abstract_txt:catalogs in 4849) [ClassicSimilarity], result of:
            0.08243342 = score(doc=4849,freq=2.0), product of:
              0.1025033 = queryWeight, product of:
                1.4999691 = boost
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.011266172 = queryNorm
              0.80420256 = fieldWeight in 4849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.0351621 = weight(abstract_txt:found in 4849) [ClassicSimilarity], result of:
            0.0351621 = score(doc=4849,freq=1.0), product of:
              0.08377079 = queryWeight, product of:
                1.6607541 = boost
                4.4772453 = idf(docFreq=1365, maxDocs=44218)
                0.011266172 = queryNorm
              0.41974175 = fieldWeight in 4849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4772453 = idf(docFreq=1365, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.08404967 = weight(abstract_txt:error in 4849) [ClassicSimilarity], result of:
            0.08404967 = score(doc=4849,freq=1.0), product of:
              0.1308287 = queryWeight, product of:
                1.694591 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.011266172 = queryNorm
              0.6424407 = fieldWeight in 4849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.044474848 = weight(abstract_txt:study in 4849) [ClassicSimilarity], result of:
            0.044474848 = score(doc=4849,freq=2.0), product of:
              0.09797586 = queryWeight, product of:
                2.539999 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011266172 = queryNorm
              0.45393682 = fieldWeight in 4849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.09763646 = weight(abstract_txt:records in 4849) [ClassicSimilarity], result of:
            0.09763646 = score(doc=4849,freq=3.0), product of:
              0.13604835 = queryWeight, product of:
                2.7323105 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.011266172 = queryNorm
              0.71766 = fieldWeight in 4849, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.57425785 = weight(abstract_txt:errors in 4849) [ClassicSimilarity], result of:
            0.57425785 = score(doc=4849,freq=5.0), product of:
              0.4182632 = queryWeight, product of:
                5.6685534 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.011266172 = queryNorm
              1.3729582 = fieldWeight in 4849, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
          0.9646171 = weight(abstract_txt:typographical in 4849) [ClassicSimilarity], result of:
            0.9646171 = score(doc=4849,freq=3.0), product of:
              0.66565466 = queryWeight, product of:
                6.620618 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.011266172 = queryNorm
              1.4491254 = fieldWeight in 4849, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.09375 = fieldNorm(doc=4849)
        0.4 = coord(10/25)
    
  2. Beall, J.; Kafadar, K.: Measuring typographical errors' impact on retrieval in bibliographic databases (2007) 0.51
    0.5054072 = sum of:
      0.5054072 = product of:
        1.4039088 = sum of:
          0.006893181 = weight(abstract_txt:from in 261) [ClassicSimilarity], result of:
            0.006893181 = score(doc=261,freq=1.0), product of:
              0.031923465 = queryWeight, product of:
                1.0252129 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011266172 = queryNorm
              0.21592833 = fieldWeight in 261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.0137017965 = weight(abstract_txt:retrieval in 261) [ClassicSimilarity], result of:
            0.0137017965 = score(doc=261,freq=1.0), product of:
              0.050467905 = queryWeight, product of:
                1.2890408 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.011266172 = queryNorm
              0.27149525 = fieldWeight in 261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.01608764 = weight(abstract_txt:online in 261) [ClassicSimilarity], result of:
            0.01608764 = score(doc=261,freq=1.0), product of:
              0.05616837 = queryWeight, product of:
                1.3598937 = boost
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.011266172 = queryNorm
              0.28641814 = fieldWeight in 261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.048574354 = weight(abstract_txt:catalogs in 261) [ClassicSimilarity], result of:
            0.048574354 = score(doc=261,freq=1.0), product of:
              0.1025033 = queryWeight, product of:
                1.4999691 = boost
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.011266172 = queryNorm
              0.4738809 = fieldWeight in 261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.07004139 = weight(abstract_txt:error in 261) [ClassicSimilarity], result of:
            0.07004139 = score(doc=261,freq=1.0), product of:
              0.1308287 = queryWeight, product of:
                1.694591 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.011266172 = queryNorm
              0.5353672 = fieldWeight in 261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.037062377 = weight(abstract_txt:study in 261) [ClassicSimilarity], result of:
            0.037062377 = score(doc=261,freq=2.0), product of:
              0.09797586 = queryWeight, product of:
                2.539999 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011266172 = queryNorm
              0.3782807 = fieldWeight in 261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.10504011 = weight(abstract_txt:records in 261) [ClassicSimilarity], result of:
            0.10504011 = score(doc=261,freq=5.0), product of:
              0.13604835 = queryWeight, product of:
                2.7323105 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.011266172 = queryNorm
              0.7720793 = fieldWeight in 261, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.30266044 = weight(abstract_txt:errors in 261) [ClassicSimilarity], result of:
            0.30266044 = score(doc=261,freq=2.0), product of:
              0.4182632 = queryWeight, product of:
                5.6685534 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.011266172 = queryNorm
              0.7236124 = fieldWeight in 261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
          0.80384755 = weight(abstract_txt:typographical in 261) [ClassicSimilarity], result of:
            0.80384755 = score(doc=261,freq=3.0), product of:
              0.66565466 = queryWeight, product of:
                6.620618 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.011266172 = queryNorm
              1.2076045 = fieldWeight in 261, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.078125 = fieldNorm(doc=261)
        0.36 = coord(9/25)
    
  3. Ojala, M.: Troubleshooting your search : whatever can go wrong, will go wrong (1995) 0.27
    0.26755282 = sum of:
      0.26755282 = product of:
        1.6722052 = sum of:
          0.025740225 = weight(abstract_txt:online in 4077) [ClassicSimilarity], result of:
            0.025740225 = score(doc=4077,freq=1.0), product of:
              0.05616837 = queryWeight, product of:
                1.3598937 = boost
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.011266172 = queryNorm
              0.45826903 = fieldWeight in 4077, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6661522 = idf(docFreq=3073, maxDocs=44218)
                0.125 = fieldNorm(doc=4077)
          0.112066224 = weight(abstract_txt:error in 4077) [ClassicSimilarity], result of:
            0.112066224 = score(doc=4077,freq=1.0), product of:
              0.1308287 = queryWeight, product of:
                1.694591 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.011266172 = queryNorm
              0.8565875 = fieldWeight in 4077, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.125 = fieldNorm(doc=4077)
          0.48425674 = weight(abstract_txt:errors in 4077) [ClassicSimilarity], result of:
            0.48425674 = score(doc=4077,freq=2.0), product of:
              0.4182632 = queryWeight, product of:
                5.6685534 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.011266172 = queryNorm
              1.1577799 = fieldWeight in 4077, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.125 = fieldNorm(doc=4077)
          1.050142 = weight(abstract_txt:typographical in 4077) [ClassicSimilarity], result of:
            1.050142 = score(doc=4077,freq=2.0), product of:
              0.66565466 = queryWeight, product of:
                6.620618 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.011266172 = queryNorm
              1.577608 = fieldWeight in 4077, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.125 = fieldNorm(doc=4077)
        0.16 = coord(4/25)
    
  4. Zeng, L.: Quality control of Chinese-language records using a rule-based data validation system : Part 1: an evaluation of the quality of Chinese-language records in the OCLC OLUC database (1993) 0.22
    0.2230752 = sum of:
      0.2230752 = product of:
        1.115376 = sum of:
          0.006893181 = weight(abstract_txt:from in 580) [ClassicSimilarity], result of:
            0.006893181 = score(doc=580,freq=1.0), product of:
              0.031923465 = queryWeight, product of:
                1.0252129 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011266172 = queryNorm
              0.21592833 = fieldWeight in 580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.078125 = fieldNorm(doc=580)
          0.026207058 = weight(abstract_txt:study in 580) [ClassicSimilarity], result of:
            0.026207058 = score(doc=580,freq=1.0), product of:
              0.09797586 = queryWeight, product of:
                2.539999 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011266172 = queryNorm
              0.26748484 = fieldWeight in 580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.078125 = fieldNorm(doc=580)
          0.093950726 = weight(abstract_txt:records in 580) [ClassicSimilarity], result of:
            0.093950726 = score(doc=580,freq=4.0), product of:
              0.13604835 = queryWeight, product of:
                2.7323105 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.011266172 = queryNorm
              0.6905687 = fieldWeight in 580, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.078125 = fieldNorm(doc=580)
          0.5242233 = weight(abstract_txt:errors in 580) [ClassicSimilarity], result of:
            0.5242233 = score(doc=580,freq=6.0), product of:
              0.4182632 = queryWeight, product of:
                5.6685534 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.011266172 = queryNorm
              1.2533337 = fieldWeight in 580, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.078125 = fieldNorm(doc=580)
          0.4641016 = weight(abstract_txt:typographical in 580) [ClassicSimilarity], result of:
            0.4641016 = score(doc=580,freq=1.0), product of:
              0.66565466 = queryWeight, product of:
                6.620618 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.011266172 = queryNorm
              0.6972108 = fieldWeight in 580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.078125 = fieldNorm(doc=580)
        0.2 = coord(5/25)
    
  5. Shin, H.-s.: Quality of Korean cataloging records in shared databases (2003) 0.20
    0.20401335 = sum of:
      0.20401335 = product of:
        0.7286191 = sum of:
          0.01439329 = weight(abstract_txt:terms in 5498) [ClassicSimilarity], result of:
            0.01439329 = score(doc=5498,freq=1.0), product of:
              0.045558896 = queryWeight, product of:
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011266172 = queryNorm
              0.3159271 = fieldWeight in 5498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.0137017965 = weight(abstract_txt:retrieval in 5498) [ClassicSimilarity], result of:
            0.0137017965 = score(doc=5498,freq=1.0), product of:
              0.050467905 = queryWeight, product of:
                1.2890408 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.011266172 = queryNorm
              0.27149525 = fieldWeight in 5498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.029301748 = weight(abstract_txt:found in 5498) [ClassicSimilarity], result of:
            0.029301748 = score(doc=5498,freq=1.0), product of:
              0.08377079 = queryWeight, product of:
                1.6607541 = boost
                4.4772453 = idf(docFreq=1365, maxDocs=44218)
                0.011266172 = queryNorm
              0.3497848 = fieldWeight in 5498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4772453 = idf(docFreq=1365, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.14008278 = weight(abstract_txt:error in 5498) [ClassicSimilarity], result of:
            0.14008278 = score(doc=5498,freq=4.0), product of:
              0.1308287 = queryWeight, product of:
                1.694591 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.011266172 = queryNorm
              1.0707344 = fieldWeight in 5498, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.045391954 = weight(abstract_txt:study in 5498) [ClassicSimilarity], result of:
            0.045391954 = score(doc=5498,freq=3.0), product of:
              0.09797586 = queryWeight, product of:
                2.539999 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011266172 = queryNorm
              0.46329734 = fieldWeight in 5498, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.11506568 = weight(abstract_txt:records in 5498) [ClassicSimilarity], result of:
            0.11506568 = score(doc=5498,freq=6.0), product of:
              0.13604835 = queryWeight, product of:
                2.7323105 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.011266172 = queryNorm
              0.8457705 = fieldWeight in 5498, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
          0.37068185 = weight(abstract_txt:errors in 5498) [ClassicSimilarity], result of:
            0.37068185 = score(doc=5498,freq=3.0), product of:
              0.4182632 = queryWeight, product of:
                5.6685534 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.011266172 = queryNorm
              0.88624066 = fieldWeight in 5498, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.078125 = fieldNorm(doc=5498)
        0.28 = coord(7/25)
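
The score trees above follow the tf-idf pattern of Lucene's ClassicSimilarity: each term score is queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), the term scores are summed, and the sum is scaled by the coord factor (matched terms / query terms). The following is a minimal sketch that recomputes those numbers from the listed inputs. The function names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed from the values shown in the listing. Lucene works in single precision, so the last digits may differ slightly.

```python
import math

# Constants read from the listing above.
MAX_DOCS = 44218
QUERY_NORM = 0.011266172


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float,
               query_norm: float = QUERY_NORM) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched: int, query_terms: int) -> float:
    """Sum of the term scores scaled by coord(matched / query_terms)."""
    return sum(term_scores) * (matched / query_terms)


# "typographical" in doc 4849 (entry 1); the listing shows 0.9646171.
typographical_4849 = term_score(freq=3.0, doc_freq=15,
                                boost=6.620618, field_norm=0.09375)

# Entry 3 (doc 4077): four matching terms, coord(4/25); the listing shows 0.26755282.
scores_4077 = [
    term_score(1.0, 3073, 1.3598937, 0.125),   # online
    term_score(1.0, 126, 1.694591, 0.125),     # error
    term_score(2.0, 171, 5.6685534, 0.125),    # errors
    term_score(2.0, 15, 6.620618, 0.125),      # typographical
]
total_4077 = document_score(scores_4077, matched=4, query_terms=25)

print(typographical_4849, total_4077)
```

The rare term "typographical" (docFreq=15) dominates every ranking because idf enters the product twice, once in queryWeight and once in fieldWeight, while coord penalizes documents that match only a few of the 25 query terms.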