Document (#6276)

Author
Uratani, N.
Takeda, M.
Title
¬A fast string-searching algorithm for multiple patterns
Source
Information processing and management. 29(1993) no.6, S.775-791
Year
1993
Abstract
The string-searching problem is to find all occurrences of pattern(s) in a text string. The Aho-Corasick string searching algorithm simultaneously finds all occurrences of multiple patterns in one pass through the text. The Boyer-Moore algorithm is the fastest algorithm for a single pattern. By combining the ideas of these two algorithms, presents an efficient string searching algorithm for multiple patterns. The algorithm runs in sublinear time, on the average, as the BM algorithm achieves, and its preprocessing time is linear proportional to the sum of the lengths of the patterns like the AC algorithm
Theme
Retrievalalgorithmen

Similar documents (content)

  1. Baeza-Yates, R.A.: String searching algorithms (1992) 0.19
    0.19100295 = sum of:
      0.19100295 = product of:
        1.1937685 = sum of:
          0.041772597 = weight(abstract_txt:text in 3505) [ClassicSimilarity], result of:
            0.041772597 = score(doc=3505,freq=1.0), product of:
              0.06611113 = queryWeight, product of:
                1.398443 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011690497 = queryNorm
              0.6318542 = fieldWeight in 3505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.15625 = fieldNorm(doc=3505)
          0.09938033 = weight(abstract_txt:searching in 3505) [ClassicSimilarity], result of:
            0.09938033 = score(doc=3505,freq=1.0), product of:
              0.14844215 = queryWeight, product of:
                2.9634738 = boost
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.011690497 = queryNorm
              0.6694886 = fieldWeight in 3505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.15625 = fieldNorm(doc=3505)
          0.58334374 = weight(abstract_txt:string in 3505) [ClassicSimilarity], result of:
            0.58334374 = score(doc=3505,freq=1.0), product of:
              0.5203273 = queryWeight, product of:
                6.2032003 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.011690497 = queryNorm
              1.1211091 = fieldWeight in 3505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.15625 = fieldNorm(doc=3505)
          0.4692718 = weight(abstract_txt:algorithm in 3505) [ClassicSimilarity], result of:
            0.4692718 = score(doc=3505,freq=1.0), product of:
              0.52640086 = queryWeight, product of:
                7.892158 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.011690497 = queryNorm
              0.89147234 = fieldWeight in 3505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.15625 = fieldNorm(doc=3505)
        0.16 = coord(4/25)
    
  2. Cui, H.; Boufford, D.; Selden, P.: Semantic annotation of biosystematics literature without training examples (2010) 0.16
    0.1603329 = sum of:
      0.1603329 = product of:
        0.6680538 = sum of:
          0.037651632 = weight(abstract_txt:linear in 3422) [ClassicSimilarity], result of:
            0.037651632 = score(doc=3422,freq=1.0), product of:
              0.090188846 = queryWeight, product of:
                1.1549652 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.011690497 = queryNorm
              0.41747546 = fieldWeight in 3422, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
          0.061541017 = weight(abstract_txt:runs in 3422) [ClassicSimilarity], result of:
            0.061541017 = score(doc=3422,freq=1.0), product of:
              0.12514305 = queryWeight, product of:
                1.3604915 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.011690497 = queryNorm
              0.49176535 = fieldWeight in 3422, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
          0.016709037 = weight(abstract_txt:text in 3422) [ClassicSimilarity], result of:
            0.016709037 = score(doc=3422,freq=1.0), product of:
              0.06611113 = queryWeight, product of:
                1.398443 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011690497 = queryNorm
              0.25274166 = fieldWeight in 3422, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
          0.018037703 = weight(abstract_txt:time in 3422) [ClassicSimilarity], result of:
            0.018037703 = score(doc=3422,freq=1.0), product of:
              0.06957093 = queryWeight, product of:
                1.4345688 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.011690497 = queryNorm
              0.2592707 = fieldWeight in 3422, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
          0.074323826 = weight(abstract_txt:patterns in 3422) [ClassicSimilarity], result of:
            0.074323826 = score(doc=3422,freq=1.0), product of:
              0.22528677 = queryWeight, product of:
                3.650819 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.011690497 = queryNorm
              0.32990766 = fieldWeight in 3422, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
          0.4597906 = weight(abstract_txt:algorithm in 3422) [ClassicSimilarity], result of:
            0.4597906 = score(doc=3422,freq=6.0), product of:
              0.52640086 = queryWeight, product of:
                7.892158 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.011690497 = queryNorm
              0.87346095 = fieldWeight in 3422, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=3422)
        0.24 = coord(6/25)
    
  3. Ackermann, J.: Knuth-Morris-Pratt (2005) 0.15
    0.15317987 = sum of:
      0.15317987 = product of:
        0.63824946 = sum of:
          0.021383813 = weight(abstract_txt:fast in 865) [ClassicSimilarity], result of:
            0.021383813 = score(doc=865,freq=1.0), product of:
              0.06761064 = queryWeight, product of:
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.011690497 = queryNorm
              0.31627882 = fieldWeight in 865, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
          0.0146204075 = weight(abstract_txt:text in 865) [ClassicSimilarity], result of:
            0.0146204075 = score(doc=865,freq=1.0), product of:
              0.06611113 = queryWeight, product of:
                1.398443 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011690497 = queryNorm
              0.22114895 = fieldWeight in 865, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
          0.0983373 = weight(abstract_txt:moore in 865) [ClassicSimilarity], result of:
            0.0983373 = score(doc=865,freq=1.0), product of:
              0.18696937 = queryWeight, product of:
                1.6629443 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.011690497 = queryNorm
              0.52595407 = fieldWeight in 865, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
          0.10742843 = weight(abstract_txt:boyer in 865) [ClassicSimilarity], result of:
            0.10742843 = score(doc=865,freq=1.0), product of:
              0.19832209 = queryWeight, product of:
                1.712687 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.011690497 = queryNorm
              0.54168665 = fieldWeight in 865, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
          0.1077391 = weight(abstract_txt:pattern in 865) [ClassicSimilarity], result of:
            0.1077391 = score(doc=865,freq=4.0), product of:
              0.15771167 = queryWeight, product of:
                2.1599286 = boost
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.011690497 = queryNorm
              0.6831397 = fieldWeight in 865, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
          0.2887404 = weight(abstract_txt:string in 865) [ClassicSimilarity], result of:
            0.2887404 = score(doc=865,freq=2.0), product of:
              0.5203273 = queryWeight, product of:
                6.2032003 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.011690497 = queryNorm
              0.5549207 = fieldWeight in 865, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.0546875 = fieldNorm(doc=865)
        0.24 = coord(6/25)
    
  4. Mustafa, S.H.: Word-oriented approximate string matching using occurrence heuristic tables : a heuristic for searching Arabic text (2005) 0.11
    0.11110415 = sum of:
      0.11110415 = product of:
        0.5555208 = sum of:
          0.020886298 = weight(abstract_txt:text in 1715) [ClassicSimilarity], result of:
            0.020886298 = score(doc=1715,freq=1.0), product of:
              0.06611113 = queryWeight, product of:
                1.398443 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011690497 = queryNorm
              0.3159271 = fieldWeight in 1715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1715)
          0.12903915 = weight(abstract_txt:occurrences in 1715) [ClassicSimilarity], result of:
            0.12903915 = score(doc=1715,freq=1.0), product of:
              0.22259462 = queryWeight, product of:
                2.5660481 = boost
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.011690497 = queryNorm
              0.57970476 = fieldWeight in 1715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.078125 = fieldNorm(doc=1715)
          0.06423324 = weight(abstract_txt:multiple in 1715) [ClassicSimilarity], result of:
            0.06423324 = score(doc=1715,freq=1.0), product of:
              0.16004322 = queryWeight, product of:
                2.6648438 = boost
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.011690497 = queryNorm
              0.40134937 = fieldWeight in 1715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.078125 = fieldNorm(doc=1715)
          0.049690165 = weight(abstract_txt:searching in 1715) [ClassicSimilarity], result of:
            0.049690165 = score(doc=1715,freq=1.0), product of:
              0.14844215 = queryWeight, product of:
                2.9634738 = boost
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.011690497 = queryNorm
              0.3347443 = fieldWeight in 1715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.078125 = fieldNorm(doc=1715)
          0.29167187 = weight(abstract_txt:string in 1715) [ClassicSimilarity], result of:
            0.29167187 = score(doc=1715,freq=1.0), product of:
              0.5203273 = queryWeight, product of:
                6.2032003 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.011690497 = queryNorm
              0.56055456 = fieldWeight in 1715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.078125 = fieldNorm(doc=1715)
        0.2 = coord(5/25)
    
  5. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.11
    0.10629363 = sum of:
      0.10629363 = product of:
        0.53146815 = sum of:
          0.052817665 = weight(abstract_txt:achieves in 3927) [ClassicSimilarity], result of:
            0.052817665 = score(doc=3927,freq=1.0), product of:
              0.11301856 = queryWeight, product of:
                1.2929072 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.011690497 = queryNorm
              0.4673362 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.023630148 = weight(abstract_txt:text in 3927) [ClassicSimilarity], result of:
            0.023630148 = score(doc=3927,freq=2.0), product of:
              0.06611113 = queryWeight, product of:
                1.398443 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011690497 = queryNorm
              0.3574307 = fieldWeight in 3927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.018037703 = weight(abstract_txt:time in 3927) [ClassicSimilarity], result of:
            0.018037703 = score(doc=3927,freq=1.0), product of:
              0.06957093 = queryWeight, product of:
                1.4345688 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.011690497 = queryNorm
              0.2592707 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.0615652 = weight(abstract_txt:pattern in 3927) [ClassicSimilarity], result of:
            0.0615652 = score(doc=3927,freq=1.0), product of:
              0.15771167 = queryWeight, product of:
                2.1599286 = boost
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.011690497 = queryNorm
              0.39036554 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.37541744 = weight(abstract_txt:algorithm in 3927) [ClassicSimilarity], result of:
            0.37541744 = score(doc=3927,freq=4.0), product of:
              0.52640086 = queryWeight, product of:
                7.892158 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.011690497 = queryNorm
              0.71317786 = fieldWeight in 3927, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
        0.2 = coord(5/25)