Document (#6276)

Author
Uratani, N.
Takeda, M.
Title
¬A fast string-searching algorithm for multiple patterns
Source
Information processing and management. 29(1993) no.6, S.775-791
Year
1993
Abstract
The string-searching problem is to find all occurrences of pattern(s) in a text string. The Aho-Corasick string searching algorithm simultaneously finds all occurrences of multiple patterns in one pass through the text. The Boyer-Moore algorithm is the fastest algorithm for a single pattern. By combining the ideas of these two algorithms, presents an efficient string searching algorithm for multiple patterns. The algorithm runs in sublinear time, on the average, as the BM algorithm achieves, and its preprocessing time is linear proportional to the sum of the lengths of the patterns like the AC algorithm
Theme
Retrievalalgorithmen

Similar documents (content)

  1. Baeza-Yates, R.A.: String searching algorithms (1992) 0.19
    0.1908321 = sum of:
      0.1908321 = product of:
        1.1927006 = sum of:
          0.04191005 = weight(abstract_txt:text in 4506) [ClassicSimilarity], result of:
            0.04191005 = score(doc=4506,freq=1.0), product of:
              0.06619911 = queryWeight, product of:
                1.3991206 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.01167753 = queryNorm
              0.63309085 = fieldWeight in 4506, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.15625 = fieldNorm(doc=4506)
          0.097887486 = weight(abstract_txt:searching in 4506) [ClassicSimilarity], result of:
            0.097887486 = score(doc=4506,freq=1.0), product of:
              0.14682549 = queryWeight, product of:
                2.9467602 = boost
                4.2668333 = idf(docFreq=1612, maxDocs=42306)
                0.01167753 = queryNorm
              0.66669273 = fieldWeight in 4506, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2668333 = idf(docFreq=1612, maxDocs=42306)
                0.15625 = fieldNorm(doc=4506)
          0.579156 = weight(abstract_txt:string in 4506) [ClassicSimilarity], result of:
            0.579156 = score(doc=4506,freq=1.0), product of:
              0.5173888 = queryWeight, product of:
                6.1845427 = boost
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.01167753 = queryNorm
              1.1193825 = fieldWeight in 4506, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.15625 = fieldNorm(doc=4506)
          0.4737471 = weight(abstract_txt:algorithm in 4506) [ClassicSimilarity], result of:
            0.4737471 = score(doc=4506,freq=1.0), product of:
              0.5292868 = queryWeight, product of:
                7.9123335 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01167753 = queryNorm
              0.8950669 = fieldWeight in 4506, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.15625 = fieldNorm(doc=4506)
        0.16 = coord(4/25)
    
  2. Cui, H.; Boufford, D.; Selden, P.: Semantic annotation of biosystematics literature without training examples (2010) 0.16
    0.16181535 = sum of:
      0.16181535 = product of:
        0.67423064 = sum of:
          0.03858295 = weight(abstract_txt:linear in 423) [ClassicSimilarity], result of:
            0.03858295 = score(doc=423,freq=1.0), product of:
              0.09159118 = queryWeight, product of:
                1.1636996 = boost
                6.7400293 = idf(docFreq=135, maxDocs=42306)
                0.01167753 = queryNorm
              0.42125183 = fieldWeight in 423, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7400293 = idf(docFreq=135, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
          0.060353678 = weight(abstract_txt:runs in 423) [ClassicSimilarity], result of:
            0.060353678 = score(doc=423,freq=1.0), product of:
              0.12342198 = queryWeight, product of:
                1.35086 = boost
                7.824043 = idf(docFreq=45, maxDocs=42306)
                0.01167753 = queryNorm
              0.48900267 = fieldWeight in 423, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.824043 = idf(docFreq=45, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
          0.01676402 = weight(abstract_txt:text in 423) [ClassicSimilarity], result of:
            0.01676402 = score(doc=423,freq=1.0), product of:
              0.06619911 = queryWeight, product of:
                1.3991206 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.01167753 = queryNorm
              0.25323635 = fieldWeight in 423, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
          0.018334031 = weight(abstract_txt:time in 423) [ClassicSimilarity], result of:
            0.018334031 = score(doc=423,freq=1.0), product of:
              0.07027033 = queryWeight, product of:
                1.4415014 = boost
                4.1745143 = idf(docFreq=1768, maxDocs=42306)
                0.01167753 = queryNorm
              0.26090714 = fieldWeight in 423, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1745143 = idf(docFreq=1768, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
          0.07602046 = weight(abstract_txt:patterns in 423) [ClassicSimilarity], result of:
            0.07602046 = score(doc=423,freq=1.0), product of:
              0.22850569 = queryWeight, product of:
                3.6761444 = boost
                5.322963 = idf(docFreq=560, maxDocs=42306)
                0.01167753 = queryNorm
              0.3326852 = fieldWeight in 423, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.322963 = idf(docFreq=560, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
          0.4641755 = weight(abstract_txt:algorithm in 423) [ClassicSimilarity], result of:
            0.4641755 = score(doc=423,freq=6.0), product of:
              0.5292868 = queryWeight, product of:
                7.9123335 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01167753 = queryNorm
              0.8769829 = fieldWeight in 423, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=423)
        0.24 = coord(6/25)
    
  3. Ackermann, J.: Knuth-Morris-Pratt (2005) 0.15
    0.15203641 = sum of:
      0.15203641 = product of:
        0.6334851 = sum of:
          0.02177829 = weight(abstract_txt:fast in 1991) [ClassicSimilarity], result of:
            0.02177829 = score(doc=1991,freq=1.0), product of:
              0.06838074 = queryWeight, product of:
                1.0054975 = boost
                5.8237386 = idf(docFreq=339, maxDocs=42306)
                0.01167753 = queryNorm
              0.3184857 = fieldWeight in 1991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8237386 = idf(docFreq=339, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
          0.014668519 = weight(abstract_txt:text in 1991) [ClassicSimilarity], result of:
            0.014668519 = score(doc=1991,freq=1.0), product of:
              0.06619911 = queryWeight, product of:
                1.3991206 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.01167753 = queryNorm
              0.22158182 = fieldWeight in 1991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
          0.09673759 = weight(abstract_txt:moore in 1991) [ClassicSimilarity], result of:
            0.09673759 = score(doc=1991,freq=1.0), product of:
              0.18477711 = queryWeight, product of:
                1.6528679 = boost
                9.573242 = idf(docFreq=7, maxDocs=42306)
                0.01167753 = queryNorm
              0.5235367 = fieldWeight in 1991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.573242 = idf(docFreq=7, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
          0.1057234 = weight(abstract_txt:boyer in 1991) [ClassicSimilarity], result of:
            0.1057234 = score(doc=1991,freq=1.0), product of:
              0.19604935 = queryWeight, product of:
                1.7025378 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.01167753 = queryNorm
              0.5392693 = fieldWeight in 1991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
          0.10790967 = weight(abstract_txt:pattern in 1991) [ClassicSimilarity], result of:
            0.10790967 = score(doc=1991,freq=4.0), product of:
              0.15774232 = queryWeight, product of:
                2.1597486 = boost
                6.2545214 = idf(docFreq=220, maxDocs=42306)
                0.01167753 = queryNorm
              0.6840883 = fieldWeight in 1991, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2545214 = idf(docFreq=220, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
          0.2866676 = weight(abstract_txt:string in 1991) [ClassicSimilarity], result of:
            0.2866676 = score(doc=1991,freq=2.0), product of:
              0.5173888 = queryWeight, product of:
                6.1845427 = boost
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.01167753 = queryNorm
              0.55406606 = fieldWeight in 1991, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.0546875 = fieldNorm(doc=1991)
        0.24 = coord(6/25)
    
  4. Mustafa, S.H.: Word-oriented approximate string matching using occurrence heuristic tables : a heuristic for searching Arabic text (2005) 0.11
    0.11081717 = sum of:
      0.11081717 = product of:
        0.55408585 = sum of:
          0.020955024 = weight(abstract_txt:text in 2716) [ClassicSimilarity], result of:
            0.020955024 = score(doc=2716,freq=1.0), product of:
              0.06619911 = queryWeight, product of:
                1.3991206 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.01167753 = queryNorm
              0.31654543 = fieldWeight in 2716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.078125 = fieldNorm(doc=2716)
          0.12862104 = weight(abstract_txt:occurrences in 2716) [ClassicSimilarity], result of:
            0.12862104 = score(doc=2716,freq=1.0), product of:
              0.22192252 = queryWeight, product of:
                2.5617087 = boost
                7.4185777 = idf(docFreq=68, maxDocs=42306)
                0.01167753 = queryNorm
              0.5795764 = fieldWeight in 2716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4185777 = idf(docFreq=68, maxDocs=42306)
                0.078125 = fieldNorm(doc=2716)
          0.06598806 = weight(abstract_txt:multiple in 2716) [ClassicSimilarity], result of:
            0.06598806 = score(doc=2716,freq=1.0), product of:
              0.16280484 = queryWeight, product of:
                2.6872518 = boost
                5.188096 = idf(docFreq=641, maxDocs=42306)
                0.01167753 = queryNorm
              0.40532 = fieldWeight in 2716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188096 = idf(docFreq=641, maxDocs=42306)
                0.078125 = fieldNorm(doc=2716)
          0.048943743 = weight(abstract_txt:searching in 2716) [ClassicSimilarity], result of:
            0.048943743 = score(doc=2716,freq=1.0), product of:
              0.14682549 = queryWeight, product of:
                2.9467602 = boost
                4.2668333 = idf(docFreq=1612, maxDocs=42306)
                0.01167753 = queryNorm
              0.33334637 = fieldWeight in 2716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2668333 = idf(docFreq=1612, maxDocs=42306)
                0.078125 = fieldNorm(doc=2716)
          0.289578 = weight(abstract_txt:string in 2716) [ClassicSimilarity], result of:
            0.289578 = score(doc=2716,freq=1.0), product of:
              0.5173888 = queryWeight, product of:
                6.1845427 = boost
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.01167753 = queryNorm
              0.55969125 = fieldWeight in 2716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1640477 = idf(docFreq=88, maxDocs=42306)
                0.078125 = fieldNorm(doc=2716)
        0.2 = coord(5/25)
    
  5. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.11
    0.10721334 = sum of:
      0.10721334 = product of:
        0.5360667 = sum of:
          0.05336442 = weight(abstract_txt:achieves in 846) [ClassicSimilarity], result of:
            0.05336442 = score(doc=846,freq=1.0), product of:
              0.113699324 = queryWeight, product of:
                1.2965611 = boost
                7.5095496 = idf(docFreq=62, maxDocs=42306)
                0.01167753 = queryNorm
              0.46934685 = fieldWeight in 846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5095496 = idf(docFreq=62, maxDocs=42306)
                0.0625 = fieldNorm(doc=846)
          0.023707906 = weight(abstract_txt:text in 846) [ClassicSimilarity], result of:
            0.023707906 = score(doc=846,freq=2.0), product of:
              0.06619911 = queryWeight, product of:
                1.3991206 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.01167753 = queryNorm
              0.35813028 = fieldWeight in 846, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0625 = fieldNorm(doc=846)
          0.018334031 = weight(abstract_txt:time in 846) [ClassicSimilarity], result of:
            0.018334031 = score(doc=846,freq=1.0), product of:
              0.07027033 = queryWeight, product of:
                1.4415014 = boost
                4.1745143 = idf(docFreq=1768, maxDocs=42306)
                0.01167753 = queryNorm
              0.26090714 = fieldWeight in 846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1745143 = idf(docFreq=1768, maxDocs=42306)
                0.0625 = fieldNorm(doc=846)
          0.06166267 = weight(abstract_txt:pattern in 846) [ClassicSimilarity], result of:
            0.06166267 = score(doc=846,freq=1.0), product of:
              0.15774232 = queryWeight, product of:
                2.1597486 = boost
                6.2545214 = idf(docFreq=220, maxDocs=42306)
                0.01167753 = queryNorm
              0.3909076 = fieldWeight in 846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2545214 = idf(docFreq=220, maxDocs=42306)
                0.0625 = fieldNorm(doc=846)
          0.37899768 = weight(abstract_txt:algorithm in 846) [ClassicSimilarity], result of:
            0.37899768 = score(doc=846,freq=4.0), product of:
              0.5292868 = queryWeight, product of:
                7.9123335 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01167753 = queryNorm
              0.71605355 = fieldWeight in 846, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=846)
        0.2 = coord(5/25)