Document (#7819)

Author
Willett, P.
Title
Best-match text retrieval
Source
Library and information briefings. 1993, no.49, pp.1-11
Year
1993
Abstract
Provides an introduction to the computational techniques that underlie best-match retrieval systems. Discusses the problems of traditional Boolean systems; the characteristics of best-match searching; automatic indexing; term conflation; and the matching of documents and queries (similarity measures, initial weights, relevance weights, and the matching algorithm). Describes operational best-match systems.
Theme
Retrieval algorithms

Similar documents (author)

  1. Willett, P.: Recent trends in hierarchic document clustering : a critical review (1988) 5.00
    5.0043335 = sum of:
      5.0043335 = weight(author_txt:willett in 2604) [ClassicSimilarity], result of:
        5.0043335 = fieldWeight in 2604, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.006933 = idf(docFreq=37, maxDocs=41962)
          0.625 = fieldNorm(doc=2604)
    
  2. Willett, P.: From chemical documentation to chemoinformatics : 50 years of chemical information science (2009) 5.00
    5.0043335 = sum of:
      5.0043335 = weight(author_txt:willett in 657) [ClassicSimilarity], result of:
        5.0043335 = fieldWeight in 657, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.006933 = idf(docFreq=37, maxDocs=41962)
          0.625 = fieldNorm(doc=657)
    
  3. Perry, R.; Willett, P.: ¬A review of the use of inverted files for best match searching in information retrieval systems (1983) 4.00
    4.0034666 = sum of:
      4.0034666 = weight(author_txt:willett in 2701) [ClassicSimilarity], result of:
        4.0034666 = fieldWeight in 2701, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.006933 = idf(docFreq=37, maxDocs=41962)
          0.5 = fieldNorm(doc=2701)
    
  4. Robertson, A.M.; Willett, P.: Retrieval techniques for historical English text : searching the sixteenth and seventeenth century titles in the Catalogue of Canterbury Cathedral Library using spelling-correction methods (1992) 4.00
    4.0034666 = sum of:
      4.0034666 = weight(author_txt:willett in 4209) [ClassicSimilarity], result of:
        4.0034666 = fieldWeight in 4209, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.006933 = idf(docFreq=37, maxDocs=41962)
          0.5 = fieldNorm(doc=4209)
    
  5. Shaw, R.J.; Willett, P.: On the non-random nature of nearest-neighbour document clusters (1993) 4.00
    4.0034666 = sum of:
      4.0034666 = weight(author_txt:willett in 5817) [ClassicSimilarity], result of:
        4.0034666 = fieldWeight in 5817, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.006933 = idf(docFreq=37, maxDocs=41962)
          0.5 = fieldNorm(doc=5817)
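
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, assuming the tf and idf formulas documented for Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and are not part of the catalog software; the input values are copied from entries 1 and 3.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain listing."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the listing: author term "willett",
# docFreq=37, maxDocs=41962, termFreq=1.0.
w_entry1 = field_weight(1.0, 37, 41962, 0.625)  # entry 1, fieldNorm 0.625
w_entry3 = field_weight(1.0, 37, 41962, 0.5)    # entry 3, fieldNorm 0.5

print(f"entry 1: {w_entry1:.4f}")
print(f"entry 3: {w_entry3:.4f}")
```

Under these assumptions the two results agree with the listed 5.0043335 and 4.0034666 to about four decimal places. The only factor that differs between the entries is fieldNorm, which is smaller for records with more authors in the field.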
    

Similar documents (content)

  1. Paris, L.A.H.; Tibbo, H.R.: Freestyle vs. Boolean : a comparison of partial and exact match retrieval systems (1998) 0.33
    0.3301346 = sum of:
      0.3301346 = product of:
        1.1790521 = sum of:
          0.05149033 = weight(abstract_txt:techniques in 4743) [ClassicSimilarity], result of:
            0.05149033 = score(doc=4743,freq=3.0), product of:
              0.07004113 = queryWeight, product of:
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.01547079 = queryNorm
              0.7351442 = fieldWeight in 4743, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.032688044 = weight(abstract_txt:traditional in 4743) [ClassicSimilarity], result of:
            0.032688044 = score(doc=4743,freq=1.0), product of:
              0.07461665 = queryWeight, product of:
                1.0321465 = boost
                4.672851 = idf(docFreq=1065, maxDocs=41962)
                0.01547079 = queryNorm
              0.43807977 = fieldWeight in 4743, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.672851 = idf(docFreq=1065, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.04269254 = weight(abstract_txt:queries in 4743) [ClassicSimilarity], result of:
            0.04269254 = score(doc=4743,freq=1.0), product of:
              0.089154735 = queryWeight, product of:
                1.1282248 = boost
                5.107828 = idf(docFreq=689, maxDocs=41962)
                0.01547079 = queryNorm
              0.4788589 = fieldWeight in 4743, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.107828 = idf(docFreq=689, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.10854261 = weight(abstract_txt:boolean in 4743) [ClassicSimilarity], result of:
            0.10854261 = score(doc=4743,freq=2.0), product of:
              0.13181555 = queryWeight, product of:
                1.3718504 = boost
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.01547079 = queryNorm
              0.8234432 = fieldWeight in 4743, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.037578903 = weight(abstract_txt:retrieval in 4743) [ClassicSimilarity], result of:
            0.037578903 = score(doc=4743,freq=2.0), product of:
              0.08188528 = queryWeight, product of:
                1.5291193 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.01547079 = queryNorm
              0.45892137 = fieldWeight in 4743, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.16667406 = weight(abstract_txt:best in 4743) [ClassicSimilarity], result of:
            0.16667406 = score(doc=4743,freq=1.0), product of:
              0.35089332 = queryWeight, product of:
                4.476525 = boost
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.01547079 = queryNorm
              0.47499925 = fieldWeight in 4743, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
          0.73938555 = weight(abstract_txt:match in 4743) [ClassicSimilarity], result of:
            0.73938555 = score(doc=4743,freq=5.0), product of:
              0.55401355 = queryWeight, product of:
                5.6248846 = boost
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.01547079 = queryNorm
              1.3345983 = fieldWeight in 4743, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.09375 = fieldNorm(doc=4743)
        0.28 = coord(7/25)
    
  2. Loughran, H.: ¬A review of nearest neighbour information retrieval (1994) 0.27
    0.26932335 = sum of:
      0.26932335 = product of:
        0.9618691 = sum of:
          0.04358406 = weight(abstract_txt:traditional in 685) [ClassicSimilarity], result of:
            0.04358406 = score(doc=685,freq=1.0), product of:
              0.07461665 = queryWeight, product of:
                1.0321465 = boost
                4.672851 = idf(docFreq=1065, maxDocs=41962)
                0.01547079 = queryNorm
              0.5841064 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.672851 = idf(docFreq=1065, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.10233497 = weight(abstract_txt:boolean in 685) [ClassicSimilarity], result of:
            0.10233497 = score(doc=685,freq=1.0), product of:
              0.13181555 = queryWeight, product of:
                1.3718504 = boost
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.01547079 = queryNorm
              0.7763497 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.03542973 = weight(abstract_txt:retrieval in 685) [ClassicSimilarity], result of:
            0.03542973 = score(doc=685,freq=1.0), product of:
              0.08188528 = queryWeight, product of:
                1.5291193 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.01547079 = queryNorm
              0.4326752 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.06612761 = weight(abstract_txt:searching in 685) [ClassicSimilarity], result of:
            0.06612761 = score(doc=685,freq=1.0), product of:
              0.12413164 = queryWeight, product of:
                1.8826938 = boost
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.01547079 = queryNorm
              0.53272164 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.051276352 = weight(abstract_txt:systems in 685) [ClassicSimilarity], result of:
            0.051276352 = score(doc=685,freq=1.0), product of:
              0.11993219 = queryWeight, product of:
                2.2664802 = boost
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.01547079 = queryNorm
              0.42754453 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.22223207 = weight(abstract_txt:best in 685) [ClassicSimilarity], result of:
            0.22223207 = score(doc=685,freq=1.0), product of:
              0.35089332 = queryWeight, product of:
                4.476525 = boost
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.01547079 = queryNorm
              0.6333323 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
          0.44088432 = weight(abstract_txt:match in 685) [ClassicSimilarity], result of:
            0.44088432 = score(doc=685,freq=1.0), product of:
              0.55401355 = queryWeight, product of:
                5.6248846 = boost
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.01547079 = queryNorm
              0.7958006 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.125 = fieldNorm(doc=685)
        0.28 = coord(7/25)
    
  3. Ford, N.; Miller, D.; Moss, N.: Web search strategies and approaches to studying (2003) 0.24
    0.23506069 = sum of:
      0.23506069 = product of:
        0.9794196 = sum of:
          0.043134958 = weight(abstract_txt:queries in 168) [ClassicSimilarity], result of:
            0.043134958 = score(doc=168,freq=3.0), product of:
              0.089154735 = queryWeight, product of:
                1.1282248 = boost
                5.107828 = idf(docFreq=689, maxDocs=41962)
                0.01547079 = queryNorm
              0.48382127 = fieldWeight in 168, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.107828 = idf(docFreq=689, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
          0.030118387 = weight(abstract_txt:measures in 168) [ClassicSimilarity], result of:
            0.030118387 = score(doc=168,freq=1.0), product of:
              0.10120137 = queryWeight, product of:
                1.2020336 = boost
                5.441984 = idf(docFreq=493, maxDocs=41962)
                0.01547079 = queryNorm
              0.2976085 = fieldWeight in 168, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.441984 = idf(docFreq=493, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
          0.10966746 = weight(abstract_txt:boolean in 168) [ClassicSimilarity], result of:
            0.10966746 = score(doc=168,freq=6.0), product of:
              0.13181555 = queryWeight, product of:
                1.3718504 = boost
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.01547079 = queryNorm
              0.8319766 = fieldWeight in 168, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.210798 = idf(docFreq=228, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
          0.028930832 = weight(abstract_txt:searching in 168) [ClassicSimilarity], result of:
            0.028930832 = score(doc=168,freq=1.0), product of:
              0.12413164 = queryWeight, product of:
                1.8826938 = boost
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.01547079 = queryNorm
              0.23306572 = fieldWeight in 168, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
          0.25723723 = weight(abstract_txt:best in 168) [ClassicSimilarity], result of:
            0.25723723 = score(doc=168,freq=7.0), product of:
              0.35089332 = queryWeight, product of:
                4.476525 = boost
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.01547079 = queryNorm
              0.73309237 = fieldWeight in 168, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
          0.51033074 = weight(abstract_txt:match in 168) [ClassicSimilarity], result of:
            0.51033074 = score(doc=168,freq=7.0), product of:
              0.55401355 = queryWeight, product of:
                5.6248846 = boost
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.01547079 = queryNorm
              0.92115206 = fieldWeight in 168, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.0546875 = fieldNorm(doc=168)
        0.24 = coord(6/25)
    
  4. Greengrass, M.: Conflation methods for searching databases of Latin text (1996) 0.23
    0.23048328 = sum of:
      0.23048328 = product of:
        0.82315457 = sum of:
          0.03634563 = weight(abstract_txt:term in 57) [ClassicSimilarity], result of:
            0.03634563 = score(doc=57,freq=1.0), product of:
              0.08008378 = queryWeight, product of:
                1.0692905 = boost
                4.8410144 = idf(docFreq=900, maxDocs=41962)
                0.01547079 = queryNorm
              0.45384508 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8410144 = idf(docFreq=900, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.060388368 = weight(abstract_txt:algorithm in 57) [ClassicSimilarity], result of:
            0.060388368 = score(doc=57,freq=1.0), product of:
              0.11234281 = queryWeight, product of:
                1.2664734 = boost
                5.733723 = idf(docFreq=368, maxDocs=41962)
                0.01547079 = queryNorm
              0.53753656 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.733723 = idf(docFreq=368, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.037578903 = weight(abstract_txt:retrieval in 57) [ClassicSimilarity], result of:
            0.037578903 = score(doc=57,freq=2.0), product of:
              0.08188528 = queryWeight, product of:
                1.5291193 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.01547079 = queryNorm
              0.45892137 = fieldWeight in 57, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.04959571 = weight(abstract_txt:searching in 57) [ClassicSimilarity], result of:
            0.04959571 = score(doc=57,freq=1.0), product of:
              0.12413164 = queryWeight, product of:
                1.8826938 = boost
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.01547079 = queryNorm
              0.39954123 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.27012548 = weight(abstract_txt:conflation in 57) [ClassicSimilarity], result of:
            0.27012548 = score(doc=57,freq=1.0), product of:
              0.30499086 = queryWeight, product of:
                2.0867329 = boost
                9.447295 = idf(docFreq=8, maxDocs=41962)
                0.01547079 = queryNorm
              0.8856839 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.447295 = idf(docFreq=8, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.038457263 = weight(abstract_txt:systems in 57) [ClassicSimilarity], result of:
            0.038457263 = score(doc=57,freq=1.0), product of:
              0.11993219 = queryWeight, product of:
                2.2664802 = boost
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.01547079 = queryNorm
              0.3206584 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
          0.33066323 = weight(abstract_txt:match in 57) [ClassicSimilarity], result of:
            0.33066323 = score(doc=57,freq=1.0), product of:
              0.55401355 = queryWeight, product of:
                5.6248846 = boost
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.01547079 = queryNorm
              0.59685045 = fieldWeight in 57, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.09375 = fieldNorm(doc=57)
        0.28 = coord(7/25)
    
  5. Hancock-Beaulieu, M.; Walker, S.: ¬An evaluation of automatic query expansion in an online library catalogue (1992) 0.23
    0.22762132 = sum of:
      0.22762132 = product of:
        0.9484222 = sum of:
          0.038913444 = weight(abstract_txt:relevance in 4145) [ClassicSimilarity], result of:
            0.038913444 = score(doc=4145,freq=1.0), product of:
              0.08381265 = queryWeight, product of:
                1.0939015 = boost
                4.952436 = idf(docFreq=805, maxDocs=41962)
                0.01547079 = queryNorm
              0.46429086 = fieldWeight in 4145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.952436 = idf(docFreq=805, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
          0.06360186 = weight(abstract_txt:automatic in 4145) [ClassicSimilarity], result of:
            0.06360186 = score(doc=4145,freq=2.0), product of:
              0.0923024 = queryWeight, product of:
                1.1479684 = boost
                5.1972136 = idf(docFreq=630, maxDocs=41962)
                0.01547079 = queryNorm
              0.6890597 = fieldWeight in 4145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1972136 = idf(docFreq=630, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
          0.09297002 = weight(abstract_txt:operational in 4145) [ClassicSimilarity], result of:
            0.09297002 = score(doc=4145,freq=1.0), product of:
              0.14978617 = queryWeight, product of:
                1.4623768 = boost
                6.6206393 = idf(docFreq=151, maxDocs=41962)
                0.01547079 = queryNorm
              0.6206849 = fieldWeight in 4145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6206393 = idf(docFreq=151, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
          0.04959571 = weight(abstract_txt:searching in 4145) [ClassicSimilarity], result of:
            0.04959571 = score(doc=4145,freq=1.0), product of:
              0.12413164 = queryWeight, product of:
                1.8826938 = boost
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.01547079 = queryNorm
              0.39954123 = fieldWeight in 4145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.261773 = idf(docFreq=1607, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
          0.23571272 = weight(abstract_txt:best in 4145) [ClassicSimilarity], result of:
            0.23571272 = score(doc=4145,freq=2.0), product of:
              0.35089332 = queryWeight, product of:
                4.476525 = boost
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.01547079 = queryNorm
              0.67175037 = fieldWeight in 4145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0666585 = idf(docFreq=718, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
          0.46762845 = weight(abstract_txt:match in 4145) [ClassicSimilarity], result of:
            0.46762845 = score(doc=4145,freq=2.0), product of:
              0.55401355 = queryWeight, product of:
                5.6248846 = boost
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.01547079 = queryNorm
              0.844074 = fieldWeight in 4145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.366405 = idf(docFreq=195, maxDocs=41962)
                0.09375 = fieldNorm(doc=4145)
        0.24 = coord(6/25)
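
The content scores above combine several matching terms: each term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched terms over query terms). The following is a rough sketch of that calculation for entry 1 (doc 4743), again assuming the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). All function and variable names are hypothetical; the boosts, frequencies, queryNorm, fieldNorm and coord values are copied from the listing, and a term shown without a boost line is treated as boost 1.0.

```python
import math

MAX_DOCS = 41962          # maxDocs from the listing
QUERY_NORM = 0.01547079   # queryNorm from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, matched: int, total: int) -> float:
    """Sum of term contributions, scaled by coord = matched / total."""
    raw = sum(term_score(boost, freq, doc_freq, field_norm)
              for boost, freq, doc_freq in terms)
    return raw * (matched / total)


# (boost, termFreq, docFreq) for the seven terms matched in doc 4743.
TERMS_4743 = [
    (1.0,       3.0, 1232),  # techniques (no boost line in the listing)
    (1.0321465, 1.0, 1065),  # traditional
    (1.1282248, 1.0, 689),   # queries
    (1.3718504, 2.0, 228),   # boolean
    (1.5291193, 2.0, 3579),  # retrieval
    (4.476525,  1.0, 718),   # best
    (5.6248846, 5.0, 195),   # match
]

score_4743 = document_score(TERMS_4743, field_norm=0.09375, matched=7, total=25)
print(f"doc 4743: {score_4743:.4f}")
```

Under these assumptions the result agrees with the listed 0.3301346 to about four decimal places. Most of that score comes from the heavily boosted terms "match" and "best", which is why the top-ranked records are the ones that repeat those words in their abstracts.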