Document (#19144)

Author
Brooks, T.A.
Title
Orthography as a fundamental impediment to online information retrieval
Source
Journal of the American Society for Information Science. 49(1998) no.8, pp.731-741
Year
1998
Abstract
Orthography is the linguistic study of written language: elements of text such as letters, punctuation marks, and spelling. Information retrieval systems operate in the orthographic realm, matching some text strings (i.e., index entries) from documents with other text strings (i.e., query terms) from patrons. During the early history of information retrieval, it has been convenient to assume the rationality and uniformity of orthography in order to concentrate effort on building information retrieval systems. Fundamental orthographic problems have persisted into modern information retrieval systems, however, where white-space normalization and the arbitrary treatment of punctuation have exacerbated the orthographic impediment to information retrieval.

Similar documents (author)

  1. Brooks, T.A.: ¬The model of science and scientific models in librarianship (1989) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:brooks in 418) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 418, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=418)
    
  2. Brooks, T.A.: Private acts and public objects : an investigation of citer motivations (1985) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:brooks in 649) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=649)
    
  3. Brooks, T.A.: Evidence of complex citer motivation (1986) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:brooks in 650) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 650, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=650)
    
  4. Brooks, D.: System-system interaction in computerized indexing of visual materials : a selected review (1988) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:brooks in 656) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 656, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=656)
    
  5. Brooks, L.: Nonanalytic concept formation and memory for instances (1978) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:brooks in 794) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 794, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=794)
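
The author scores above all come out to 5.106579 because each is a single-term fieldWeight: tf times idf times fieldNorm. The sketch below reproduces that arithmetic. It assumes the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers shown here. The function names are illustrative and are not part of any library API.

```python
import math


def classic_tf(freq: float) -> float:
    # Assumed term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as laid out in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Inputs copied from the author_txt:brooks entries above.
author_idf = classic_idf(doc_freq=33, max_docs=44218)
author_score = field_weight(freq=1.0, doc_freq=33, max_docs=44218, field_norm=0.625)

# Compare with idf = 8.1705265 and score = 5.106579 in the listing.
print(round(author_idf, 4), round(author_score, 4))
```

Since termFreq, docFreq, and fieldNorm are identical for all five entries, the ranking among them carries no information beyond the shared surname match.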
    

Similar documents (content)

  1. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.12
    0.11761817 = sum of:
      0.11761817 = product of:
        0.49007574 = sum of:
          0.011034557 = weight(abstract_txt:have in 2541) [ClassicSimilarity], result of:
            0.011034557 = score(doc=2541,freq=2.0), product of:
              0.04452232 = queryWeight, product of:
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.013893246 = queryNorm
              0.24784327 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
          0.09206314 = weight(abstract_txt:spelling in 2541) [ClassicSimilarity], result of:
            0.09206314 = score(doc=2541,freq=3.0), product of:
              0.12698662 = queryWeight, product of:
                1.1941946 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.013893246 = queryNorm
              0.724983 = fieldWeight in 2541, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
          0.014125036 = weight(abstract_txt:systems in 2541) [ClassicSimilarity], result of:
            0.014125036 = score(doc=2541,freq=1.0), product of:
              0.07570211 = queryWeight, product of:
                1.5970213 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.013893246 = queryNorm
              0.1865871 = fieldWeight in 2541, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
          0.014272811 = weight(abstract_txt:information in 2541) [ClassicSimilarity], result of:
            0.014272811 = score(doc=2541,freq=2.0), product of:
              0.076229185 = queryWeight, product of:
                2.2663782 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.013893246 = queryNorm
              0.18723552 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
          0.04221537 = weight(abstract_txt:retrieval in 2541) [ClassicSimilarity], result of:
            0.04221537 = score(doc=2541,freq=2.0), product of:
              0.15707076 = queryWeight, product of:
                3.2532647 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013893246 = queryNorm
              0.26876658 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
          0.31636482 = weight(abstract_txt:orthographic in 2541) [ClassicSimilarity], result of:
            0.31636482 = score(doc=2541,freq=1.0), product of:
              0.6015066 = queryWeight, product of:
                4.5017037 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.013893246 = queryNorm
              0.52595407 = fieldWeight in 2541, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2541)
        0.24 = coord(6/25)
    
  2. Levow, G.-A.; Oard, D.W.; Resnik, P.: Dictionary-based techniques for cross-language information retrieval (2005) 0.11
    0.108990744 = sum of:
      0.108990744 = product of:
        0.5449537 = sum of:
          0.015763652 = weight(abstract_txt:have in 1025) [ClassicSimilarity], result of:
            0.015763652 = score(doc=1025,freq=2.0), product of:
              0.04452232 = queryWeight, product of:
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.013893246 = queryNorm
              0.35406178 = fieldWeight in 1025, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.078125 = fieldNorm(doc=1025)
          0.020178623 = weight(abstract_txt:systems in 1025) [ClassicSimilarity], result of:
            0.020178623 = score(doc=1025,freq=1.0), product of:
              0.07570211 = queryWeight, product of:
                1.5970213 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.013893246 = queryNorm
              0.26655298 = fieldWeight in 1025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.078125 = fieldNorm(doc=1025)
          0.014417716 = weight(abstract_txt:information in 1025) [ClassicSimilarity], result of:
            0.014417716 = score(doc=1025,freq=1.0), product of:
              0.076229185 = queryWeight, product of:
                2.2663782 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.013893246 = queryNorm
              0.18913643 = fieldWeight in 1025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=1025)
          0.042643964 = weight(abstract_txt:retrieval in 1025) [ClassicSimilarity], result of:
            0.042643964 = score(doc=1025,freq=1.0), product of:
              0.15707076 = queryWeight, product of:
                3.2532647 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013893246 = queryNorm
              0.27149525 = fieldWeight in 1025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=1025)
          0.45194978 = weight(abstract_txt:orthographic in 1025) [ClassicSimilarity], result of:
            0.45194978 = score(doc=1025,freq=1.0), product of:
              0.6015066 = queryWeight, product of:
                4.5017037 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.013893246 = queryNorm
              0.751363 = fieldWeight in 1025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.078125 = fieldNorm(doc=1025)
        0.2 = coord(5/25)
    
  3. Whitney, C.; Schiff, L.: ¬The Melvyl Recommender Project : developing library recommendation services (2006) 0.09
    0.08787674 = sum of:
      0.08787674 = product of:
        0.3138455 = sum of:
          0.019306453 = weight(abstract_txt:have in 1173) [ClassicSimilarity], result of:
            0.019306453 = score(doc=1173,freq=3.0), product of:
              0.04452232 = queryWeight, product of:
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.013893246 = queryNorm
              0.43363538 = fieldWeight in 1173, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.07577526 = weight(abstract_txt:patrons in 1173) [ClassicSimilarity], result of:
            0.07577526 = score(doc=1173,freq=2.0), product of:
              0.10065024 = queryWeight, product of:
                1.063172 = boost
                6.8140855 = idf(docFreq=131, maxDocs=44218)
                0.013893246 = queryNorm
              0.7528572 = fieldWeight in 1173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8140855 = idf(docFreq=131, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.075932406 = weight(abstract_txt:spelling in 1173) [ClassicSimilarity], result of:
            0.075932406 = score(doc=1173,freq=1.0), product of:
              0.12698662 = queryWeight, product of:
                1.1941946 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.013893246 = queryNorm
              0.59795594 = fieldWeight in 1173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.028536884 = weight(abstract_txt:systems in 1173) [ClassicSimilarity], result of:
            0.028536884 = score(doc=1173,freq=2.0), product of:
              0.07570211 = queryWeight, product of:
                1.5970213 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.013893246 = queryNorm
              0.37696287 = fieldWeight in 1173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.03359707 = weight(abstract_txt:text in 1173) [ClassicSimilarity], result of:
            0.03359707 = score(doc=1173,freq=1.0), product of:
              0.10634438 = queryWeight, product of:
                1.8928404 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013893246 = queryNorm
              0.3159271 = fieldWeight in 1173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.020389728 = weight(abstract_txt:information in 1173) [ClassicSimilarity], result of:
            0.020389728 = score(doc=1173,freq=2.0), product of:
              0.076229185 = queryWeight, product of:
                2.2663782 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.013893246 = queryNorm
              0.2674793 = fieldWeight in 1173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
          0.06030767 = weight(abstract_txt:retrieval in 1173) [ClassicSimilarity], result of:
            0.06030767 = score(doc=1173,freq=2.0), product of:
              0.15707076 = queryWeight, product of:
                3.2532647 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013893246 = queryNorm
              0.38395226 = fieldWeight in 1173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=1173)
        0.28 = coord(7/25)
    
  4. Taghva, K.; Borsack, J.; Condit, A.: Evaluation of model-based retrieval effectiveness with OCR text (1996) 0.08
    0.07672493 = sum of:
      0.07672493 = product of:
        0.38362464 = sum of:
          0.03424426 = weight(abstract_txt:systems in 4485) [ClassicSimilarity], result of:
            0.03424426 = score(doc=4485,freq=2.0), product of:
              0.07570211 = queryWeight, product of:
                1.5970213 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.013893246 = queryNorm
              0.45235544 = fieldWeight in 4485, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.09375 = fieldNorm(doc=4485)
          0.069830194 = weight(abstract_txt:text in 4485) [ClassicSimilarity], result of:
            0.069830194 = score(doc=4485,freq=3.0), product of:
              0.10634438 = queryWeight, product of:
                1.8928404 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013893246 = queryNorm
              0.6566421 = fieldWeight in 4485, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=4485)
          0.01730126 = weight(abstract_txt:information in 4485) [ClassicSimilarity], result of:
            0.01730126 = score(doc=4485,freq=1.0), product of:
              0.076229185 = queryWeight, product of:
                2.2663782 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.013893246 = queryNorm
              0.22696373 = fieldWeight in 4485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.09375 = fieldNorm(doc=4485)
          0.15990339 = weight(abstract_txt:strings in 4485) [ClassicSimilarity], result of:
            0.15990339 = score(doc=4485,freq=1.0), product of:
              0.23277383 = queryWeight, product of:
                2.2865367 = boost
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.013893246 = queryNorm
              0.68694746 = fieldWeight in 4485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.09375 = fieldNorm(doc=4485)
          0.10234552 = weight(abstract_txt:retrieval in 4485) [ClassicSimilarity], result of:
            0.10234552 = score(doc=4485,freq=4.0), product of:
              0.15707076 = queryWeight, product of:
                3.2532647 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013893246 = queryNorm
              0.6515886 = fieldWeight in 4485, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.09375 = fieldNorm(doc=4485)
        0.2 = coord(5/25)
    
  5. Young, C.W.; Eastman, C.M.; Oakman, R.L.: ¬An analysis of ill-formed input in natural language queries to document retrieval systems (1991) 0.08
    0.07595452 = sum of:
      0.07595452 = product of:
        0.47471577 = sum of:
          0.10738463 = weight(abstract_txt:spelling in 5263) [ClassicSimilarity], result of:
            0.10738463 = score(doc=5263,freq=2.0), product of:
              0.12698662 = queryWeight, product of:
                1.1941946 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.013893246 = queryNorm
              0.8456373 = fieldWeight in 5263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.078125 = fieldNorm(doc=5263)
          0.014417716 = weight(abstract_txt:information in 5263) [ClassicSimilarity], result of:
            0.014417716 = score(doc=5263,freq=1.0), product of:
              0.076229185 = queryWeight, product of:
                2.2663782 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.013893246 = queryNorm
              0.18913643 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=5263)
          0.31026947 = weight(abstract_txt:punctuation in 5263) [ClassicSimilarity], result of:
            0.31026947 = score(doc=5263,freq=2.0), product of:
              0.32456318 = queryWeight, product of:
                2.6999812 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.013893246 = queryNorm
              0.9559602 = fieldWeight in 5263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.078125 = fieldNorm(doc=5263)
          0.042643964 = weight(abstract_txt:retrieval in 5263) [ClassicSimilarity], result of:
            0.042643964 = score(doc=5263,freq=1.0), product of:
              0.15707076 = queryWeight, product of:
                3.2532647 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013893246 = queryNorm
              0.27149525 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=5263)
        0.16 = coord(4/25)
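
The content scores above combine several term clauses: each clause is queryWeight (boost times idf times queryNorm) multiplied by fieldWeight (tf times idf times fieldNorm), and the sum of the clauses is scaled by the coord factor (matched terms over query terms). As a rough check, the sketch below recomputes the score of the first hit (doc 2541) from the inputs printed in its explain tree. It assumes the same classic tf and idf formulas as the listing implies and takes queryNorm as given. All names are illustrative, not a library API.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def clause_score(boost, freq, doc_freq, max_docs, query_norm, field_norm):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


MAX_DOCS = 44218
QUERY_NORM = 0.013893246   # copied from the explain output, not derived here
FIELD_NORM = 0.0546875     # fieldNorm(doc=2541)

# (term, boost, freq, docFreq) for doc 2541; "have" shows no boost line, so 1.0.
clauses = [
    ("have", 1.0, 2.0, 4876),
    ("spelling", 1.1941946, 3.0, 56),
    ("systems", 1.5970213, 1.0, 3963),
    ("information", 2.2663782, 2.0, 10677),
    ("retrieval", 3.2532647, 2.0, 3720),
    ("orthographic", 4.5017037, 1.0, 7),
]

clause_sum = sum(
    clause_score(boost, freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for _, boost, freq, df in clauses
)
coord = 6 / 25  # 6 of the 25 query terms matched
total_score = clause_sum * coord

# Compare with 0.49007574 (sum) and 0.11761817 (final score) in the listing.
print(round(clause_sum, 5), round(total_score, 5))
```

The rare term "orthographic" (docFreq=7) contributes most of the sum, which is why documents matching it rank first even when they match fewer terms overall.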