Document (#23335)

Author
Wolff, J.G.
Title
¬A scalable technique for best-match retrieval of sequential information using metrics-guided search
Source
Journal of information science. 20(1994) no.1, pp. 16-28
Year
1994
Abstract
Describes a new technique for retrieving information by finding the best match or matches between a textual query and a textual database. The technique uses principles of beam search with a measure of probability to guide the search and prune the search tree. Unlike many methods for comparing strings, the method gives a set of alternative matches, graded by the quality of the matching. The new technique is embodied in a software simulation, SP21, which runs on a conventional computer. Presents examples showing best-match retrieval of information from a textual database. Presents analytic and empirical evidence on the performance of the technique. The technique lends itself well to parallel processing. Discusses planned developments.
Theme
Retrievalalgorithmen
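
The abstract mentions beam search with a metric that guides the search and prunes the search tree, and a result set of alternative matches graded by quality. The sketch below is only a generic illustration of that idea, not the SP21 algorithm: the function names `beam_match` and `best_matches`, the beam width, and the scoring metric (a simple count of query symbols matched in order, in place of the probability measure the abstract refers to) are all assumptions made for this example.

```python
from typing import List, Tuple


def beam_match(query: str, record: str, beam_width: int = 4) -> int:
    """Estimate how many query symbols occur, in order, in `record`.

    Hypothetical helper: the metric is a plain hit count, not the
    probability measure described in the abstract.
    """
    # A state is (hits, pos): symbols matched so far and the index in
    # `record` from which the next symbol may be matched.
    beam = [(0, 0)]
    for symbol in query:
        candidates = set()
        for hits, pos in beam:
            candidates.add((hits, pos))  # option 1: leave this symbol unmatched
            found = record.find(symbol, pos)
            if found != -1:
                candidates.add((hits + 1, found + 1))  # option 2: match it
        # Prune the search tree: keep only the most promising states
        # (most hits first, earliest position as tie-breaker).
        beam = sorted(candidates, key=lambda s: (-s[0], s[1]))[:beam_width]
    return beam[0][0]


def best_matches(query: str, database: List[str],
                 beam_width: int = 4, top_k: int = 3) -> List[Tuple[int, str]]:
    """Return up to `top_k` records graded by match quality, best first."""
    scored = [(beam_match(query, rec, beam_width), rec) for rec in database]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return scored[:top_k]


if __name__ == "__main__":
    records = ["information retrieval", "retreival methods", "beam search"]
    for hits, rec in best_matches("retrieval", records):
        print(hits, rec)
```

A narrow beam keeps the work per query symbol bounded, at the cost of possibly missing the best alignment; widening the beam trades time for match quality.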

Similar documents (content)

  1. Loughran, H.: ¬A review of nearest neighbour information retrieval (1994) 0.20
    0.19742936 = sum of:
      0.19742936 = product of:
        0.82262236 = sum of:
          0.029368332 = weight(abstract_txt:retrieval in 616) [ClassicSimilarity], result of:
            0.029368332 = score(doc=616,freq=1.0), product of:
              0.067607835 = queryWeight, product of:
                1.0649477 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018268231 = queryNorm
              0.43439242 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
          0.014893932 = weight(abstract_txt:information in 616) [ClassicSimilarity], result of:
            0.014893932 = score(doc=616,freq=1.0), product of:
              0.049216893 = queryWeight, product of:
                1.1128395 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018268231 = queryNorm
              0.3026183 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
          0.06850816 = weight(abstract_txt:search in 616) [ClassicSimilarity], result of:
            0.06850816 = score(doc=616,freq=1.0), product of:
              0.14982435 = queryWeight, product of:
                2.242002 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018268231 = queryNorm
              0.45725656 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
          0.13247529 = weight(abstract_txt:best in 616) [ClassicSimilarity], result of:
            0.13247529 = score(doc=616,freq=1.0), product of:
              0.21128298 = queryWeight, product of:
                2.3057258 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.018268231 = queryNorm
              0.6270041 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
          0.27180967 = weight(abstract_txt:match in 616) [ClassicSimilarity], result of:
            0.27180967 = score(doc=616,freq=1.0), product of:
              0.34115458 = queryWeight, product of:
                2.929888 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.018268231 = queryNorm
              0.79673463 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
          0.30556697 = weight(abstract_txt:technique in 616) [ClassicSimilarity], result of:
            0.30556697 = score(doc=616,freq=1.0), product of:
              0.43731576 = queryWeight, product of:
                4.2825 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.018268231 = queryNorm
              0.69873303 = fieldWeight in 616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.125 = fieldNorm(doc=616)
        0.24 = coord(6/25)
    
  2. Sakai, T.: On the reliability of information retrieval metrics based on graded relevance (2007) 0.17
    0.16878583 = sum of:
      0.16878583 = product of:
        0.7032743 = sum of:
          0.14237764 = weight(abstract_txt:metrics in 910) [ClassicSimilarity], result of:
            0.14237764 = score(doc=910,freq=5.0), product of:
              0.122966096 = queryWeight, product of:
                1.0155644 = boost
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.018268231 = queryNorm
              1.157861 = fieldWeight in 910, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
          0.018355206 = weight(abstract_txt:retrieval in 910) [ClassicSimilarity], result of:
            0.018355206 = score(doc=910,freq=1.0), product of:
              0.067607835 = queryWeight, product of:
                1.0649477 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018268231 = queryNorm
              0.27149525 = fieldWeight in 910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
          0.009308707 = weight(abstract_txt:information in 910) [ClassicSimilarity], result of:
            0.009308707 = score(doc=910,freq=1.0), product of:
              0.049216893 = queryWeight, product of:
                1.1128395 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018268231 = queryNorm
              0.18913643 = fieldWeight in 910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
          0.10652372 = weight(abstract_txt:runs in 910) [ClassicSimilarity], result of:
            0.10652372 = score(doc=910,freq=1.0), product of:
              0.17329195 = queryWeight, product of:
                1.205602 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.018268231 = queryNorm
              0.6147067 = fieldWeight in 910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
          0.28330034 = weight(abstract_txt:graded in 910) [ClassicSimilarity], result of:
            0.28330034 = score(doc=910,freq=4.0), product of:
              0.2095522 = queryWeight, product of:
                1.3257477 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.018268231 = queryNorm
              1.351932 = fieldWeight in 910, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
          0.14340872 = weight(abstract_txt:best in 910) [ClassicSimilarity], result of:
            0.14340872 = score(doc=910,freq=3.0), product of:
              0.21128298 = queryWeight, product of:
                2.3057258 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.018268231 = queryNorm
              0.6787518 = fieldWeight in 910, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.078125 = fieldNorm(doc=910)
        0.24 = coord(6/25)
    
  3. Sormunen, E.; Kekäläinen, J.; Koivisto, J.; Järvelin, K.: Document text characteristics affect the ranking of the most relevant documents by expanded structured queries (2001) 0.15
    0.1502243 = sum of:
      0.1502243 = product of:
        0.53651536 = sum of:
          0.020766545 = weight(abstract_txt:retrieval in 4487) [ClassicSimilarity], result of:
            0.020766545 = score(doc=4487,freq=2.0), product of:
              0.067607835 = queryWeight, product of:
                1.0649477 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018268231 = queryNorm
              0.3071618 = fieldWeight in 4487, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.014893932 = weight(abstract_txt:information in 4487) [ClassicSimilarity], result of:
            0.014893932 = score(doc=4487,freq=4.0), product of:
              0.049216893 = queryWeight, product of:
                1.1128395 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018268231 = queryNorm
              0.3026183 = fieldWeight in 4487, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.03425408 = weight(abstract_txt:search in 4487) [ClassicSimilarity], result of:
            0.03425408 = score(doc=4487,freq=1.0), product of:
              0.14982435 = queryWeight, product of:
                2.242002 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018268231 = queryNorm
              0.22862828 = fieldWeight in 4487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.06623764 = weight(abstract_txt:best in 4487) [ClassicSimilarity], result of:
            0.06623764 = score(doc=4487,freq=1.0), product of:
              0.21128298 = queryWeight, product of:
                2.3057258 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.018268231 = queryNorm
              0.31350204 = fieldWeight in 4487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.111674875 = weight(abstract_txt:textual in 4487) [ClassicSimilarity], result of:
            0.111674875 = score(doc=4487,freq=1.0), product of:
              0.29929417 = queryWeight, product of:
                2.7442555 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.018268231 = queryNorm
              0.37312746 = fieldWeight in 4487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.13590483 = weight(abstract_txt:match in 4487) [ClassicSimilarity], result of:
            0.13590483 = score(doc=4487,freq=1.0), product of:
              0.34115458 = queryWeight, product of:
                2.929888 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.018268231 = queryNorm
              0.39836732 = fieldWeight in 4487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
          0.15278348 = weight(abstract_txt:technique in 4487) [ClassicSimilarity], result of:
            0.15278348 = score(doc=4487,freq=1.0), product of:
              0.43731576 = queryWeight, product of:
                4.2825 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.018268231 = queryNorm
              0.34936652 = fieldWeight in 4487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.0625 = fieldNorm(doc=4487)
        0.28 = coord(7/25)
    
  4. Hoad, T.C.; Zobel, J.: Methods for identifying versioned and plagiarized documents (2003) 0.12
    0.12220059 = sum of:
      0.12220059 = product of:
        0.50916916 = sum of:
          0.014684166 = weight(abstract_txt:retrieval in 5159) [ClassicSimilarity], result of:
            0.014684166 = score(doc=5159,freq=1.0), product of:
              0.067607835 = queryWeight, product of:
                1.0649477 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018268231 = queryNorm
              0.21719621 = fieldWeight in 5159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
          0.007446966 = weight(abstract_txt:information in 5159) [ClassicSimilarity], result of:
            0.007446966 = score(doc=5159,freq=1.0), product of:
              0.049216893 = queryWeight, product of:
                1.1128395 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018268231 = queryNorm
              0.15130915 = fieldWeight in 5159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
          0.06882708 = weight(abstract_txt:strings in 5159) [ClassicSimilarity], result of:
            0.06882708 = score(doc=5159,freq=1.0), product of:
              0.15028897 = queryWeight, product of:
                1.1227378 = boost
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.018268231 = queryNorm
              0.45796496 = fieldWeight in 5159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
          0.06623764 = weight(abstract_txt:best in 5159) [ClassicSimilarity], result of:
            0.06623764 = score(doc=5159,freq=1.0), product of:
              0.21128298 = queryWeight, product of:
                2.3057258 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.018268231 = queryNorm
              0.31350204 = fieldWeight in 5159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
          0.13590483 = weight(abstract_txt:match in 5159) [ClassicSimilarity], result of:
            0.13590483 = score(doc=5159,freq=1.0), product of:
              0.34115458 = queryWeight, product of:
                2.929888 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.018268231 = queryNorm
              0.39836732 = fieldWeight in 5159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
          0.21606846 = weight(abstract_txt:technique in 5159) [ClassicSimilarity], result of:
            0.21606846 = score(doc=5159,freq=2.0), product of:
              0.43731576 = queryWeight, product of:
                4.2825 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.018268231 = queryNorm
              0.49407884 = fieldWeight in 5159, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.0625 = fieldNorm(doc=5159)
        0.24 = coord(6/25)
    
  5. He, W.; Erdelez, S.; Wang, F.-K.; Shyu, C.-R.: ¬The effects of conceptual description and search practice on users' mental models and information seeking in a case-based reasoning retrieval system (2008) 0.12
    0.121676974 = sum of:
      0.121676974 = product of:
        0.60838485 = sum of:
          0.020766545 = weight(abstract_txt:retrieval in 2036) [ClassicSimilarity], result of:
            0.020766545 = score(doc=2036,freq=2.0), product of:
              0.067607835 = queryWeight, product of:
                1.0649477 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018268231 = queryNorm
              0.3071618 = fieldWeight in 2036, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=2036)
          0.007446966 = weight(abstract_txt:information in 2036) [ClassicSimilarity], result of:
            0.007446966 = score(doc=2036,freq=1.0), product of:
              0.049216893 = queryWeight, product of:
                1.1128395 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018268231 = queryNorm
              0.15130915 = fieldWeight in 2036, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2036)
          0.12816705 = weight(abstract_txt:search in 2036) [ClassicSimilarity], result of:
            0.12816705 = score(doc=2036,freq=14.0), product of:
              0.14982435 = queryWeight, product of:
                2.242002 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018268231 = queryNorm
              0.8554487 = fieldWeight in 2036, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=2036)
          0.14811188 = weight(abstract_txt:best in 2036) [ClassicSimilarity], result of:
            0.14811188 = score(doc=2036,freq=5.0), product of:
              0.21128298 = queryWeight, product of:
                2.3057258 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.018268231 = queryNorm
              0.7010119 = fieldWeight in 2036, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.0625 = fieldNorm(doc=2036)
          0.30389243 = weight(abstract_txt:match in 2036) [ClassicSimilarity], result of:
            0.30389243 = score(doc=2036,freq=5.0), product of:
              0.34115458 = queryWeight, product of:
                2.929888 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.018268231 = queryNorm
              0.8907764 = fieldWeight in 2036, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.0625 = fieldNorm(doc=2036)
        0.2 = coord(5/25)
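
The explain listings above follow Lucene's ClassicSimilarity (TF-IDF) scoring: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term contributes queryWeight × fieldWeight, and the summed contributions are multiplied by coord, the fraction of the 25 query terms that matched. The following is a minimal sketch that recomputes the score of the first hit (doc 616) from the values in its listing and then shows how coord reorders the five hits. The function and variable names are illustrative only, and because Python uses double precision where the listing shows single-precision values, agreement is expected only to about five decimal places.

```python
import math

MAX_DOCS = 44218          # maxDocs, as shown in every idf(...) line
QUERY_NORM = 0.018268231  # queryNorm, identical for all terms
QUERY_TERMS = 25          # denominator of every coord(n/25) line


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm: float) -> float:
    """Sum the term contributions, then scale by coord = matched / total terms."""
    raw = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    return raw * len(terms) / QUERY_TERMS


# (freq, docFreq, boost) for the six matching terms of doc 616, copied from
# its listing: retrieval, information, search, best, match, technique.
DOC_616 = [
    (1.0, 3720, 1.0649477),
    (1.0, 10677, 1.1128395),
    (1.0, 3098, 2.242002),
    (1.0, 796, 2.3057258),
    (1.0, 204, 2.929888),
    (1.0, 448, 4.2825),
]
score_616 = document_score(DOC_616, field_norm=0.125)

# (sum of term scores, number of matched terms) per doc, from the listings.
RAW = {
    616: (0.82262236, 6),
    910: (0.7032743, 6),
    4487: (0.53651536, 7),
    5159: (0.50916916, 6),
    2036: (0.60838485, 5),
}
by_raw = sorted(RAW, key=lambda d: RAW[d][0], reverse=True)
by_final = sorted(RAW, key=lambda d: RAW[d][0] * RAW[d][1] / QUERY_TERMS, reverse=True)

if __name__ == "__main__":
    print("doc 616 score:", round(score_616, 6))
    print("order by raw sum:  ", by_raw)
    print("order after coord: ", by_final)
```

Doc 2036 has the third-largest raw sum but matches only 5 of the 25 query terms, so after the coord factor is applied it drops below docs 4487 and 5159, which is the order the listing shows.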