Document (#6682)

Editor
Thomas, D.W.
Author
Driscoll, J.R.
Rajala, D.A.
Shaffer, W.H.
Title
¬The operation and performance of an artificially intelligent keywording system
Source
Information processing and management. 27(1991) no.1, pp. 43-54
Year
1991
Abstract
Presents a new approach to text analysis for automating the key-phrase indexing process, using artificial intelligence techniques. The approach mimics the behaviour of human experts by using a rule base consisting of insertion and deletion rules generated by subject-matter experts. The insertion rules are based on the idea that some phrases found in a text imply or trigger other phrases. The deletion rules apply to semantically ambiguous phrases, where presence in the text alone does not determine appropriateness as a key phrase. The insertion and deletion rules are used to transform a list of found phrases into a list of key phrases for indexing a document. Statistical data are provided to demonstrate the performance of this expert rule-based system.
Theme
Automatic indexing
Computational linguistics
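
The transformation described in the abstract (found phrases in, key phrases out) can be illustrated with a minimal sketch. This is not the authors' implementation: the rule classes, the `apply_rules` function, the example phrases, and in particular the reading of a deletion rule as "drop an ambiguous phrase unless some supporting phrase was also found" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsertionRule:
    trigger: str   # phrase that must be found in the text
    implied: str   # phrase added to the key phrases when the trigger is found


@dataclass(frozen=True)
class DeletionRule:
    ambiguous: str               # phrase whose presence alone is not decisive
    required_context: frozenset  # ASSUMPTION: keep it only if one of these was found


def apply_rules(found, insertions, deletions):
    """Turn a list of found phrases into a list of key phrases."""
    found_set = set(found)
    keys = list(dict.fromkeys(found))  # drop duplicates, keep first-seen order

    # Insertion: a found phrase implies (triggers) another phrase.
    for rule in insertions:
        if rule.trigger in found_set and rule.implied not in keys:
            keys.append(rule.implied)

    # Deletion: remove an ambiguous phrase that lacks supporting context.
    for rule in deletions:
        if rule.ambiguous in keys and not (rule.required_context & found_set):
            keys.remove(rule.ambiguous)

    return keys


# Hypothetical rule base and input, invented for this example.
insertions = [InsertionRule("neural network", "machine learning")]
deletions = [DeletionRule("pipeline", frozenset({"oil", "natural gas"}))]
found = ["neural network", "pipeline", "training data"]

key_phrases = apply_rules(found, insertions, deletions)
print(key_phrases)
```

Here "pipeline" is dropped because none of its assumed context phrases appears, while "machine learning" is added because "neural network" was found.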

Similar documents (content)

  1. Vlachidis, A.; Tudhope, D.: ¬A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.21
    0.21081999 = sum of:
      0.21081999 = product of:
        0.58561105 = sum of:
          0.009322536 = weight(abstract_txt:system in 2895) [ClassicSimilarity], result of:
            0.009322536 = score(doc=2895,freq=1.0), product of:
              0.044231 = queryWeight, product of:
                1.0325341 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.012702672 = queryNorm
              0.21076928 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.017486982 = weight(abstract_txt:using in 2895) [ClassicSimilarity], result of:
            0.017486982 = score(doc=2895,freq=3.0), product of:
              0.04664519 = queryWeight, product of:
                1.0603383 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.012702672 = queryNorm
              0.37489358 = fieldWeight in 2895, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.028288595 = weight(abstract_txt:indexing in 2895) [ClassicSimilarity], result of:
            0.028288595 = score(doc=2895,freq=2.0), product of:
              0.07358144 = queryWeight, product of:
                1.3317575 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.012702672 = queryNorm
              0.38445285 = fieldWeight in 2895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.024133116 = weight(abstract_txt:performance in 2895) [ClassicSimilarity], result of:
            0.024133116 = score(doc=2895,freq=1.0), product of:
              0.08338981 = queryWeight, product of:
                1.417743 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.012702672 = queryNorm
              0.28940126 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.024112036 = weight(abstract_txt:text in 2895) [ClassicSimilarity], result of:
            0.024112036 = score(doc=2895,freq=1.0), product of:
              0.095401905 = queryWeight, product of:
                1.8572277 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012702672 = queryNorm
              0.25274166 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.0995295 = weight(abstract_txt:rule in 2895) [ClassicSimilarity], result of:
            0.0995295 = score(doc=2895,freq=2.0), product of:
              0.170214 = queryWeight, product of:
                2.0255299 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.012702672 = queryNorm
              0.5847316 = fieldWeight in 2895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.08706659 = weight(abstract_txt:phrase in 2895) [ClassicSimilarity], result of:
            0.08706659 = score(doc=2895,freq=1.0), product of:
              0.19615757 = queryWeight, product of:
                2.1744206 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.012702672 = queryNorm
              0.44386047 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.09875081 = weight(abstract_txt:rules in 2895) [ClassicSimilarity], result of:
            0.09875081 = score(doc=2895,freq=2.0), product of:
              0.21333617 = queryWeight, product of:
                3.2069209 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012702672 = queryNorm
              0.46288824 = fieldWeight in 2895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.19692092 = weight(abstract_txt:phrases in 2895) [ClassicSimilarity], result of:
            0.19692092 = score(doc=2895,freq=1.0), product of:
              0.45871747 = queryWeight, product of:
                5.257553 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.012702672 = queryNorm
              0.42928585 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
        0.36 = coord(9/25)
    
  2. Craven, T.C.: Adapting of string indexing systems for retrieval using proximity operators (1988) 0.17
    0.1665478 = sum of:
      0.1665478 = product of:
        0.832739 = sum of:
          0.016314438 = weight(abstract_txt:system in 705) [ClassicSimilarity], result of:
            0.016314438 = score(doc=705,freq=1.0), product of:
              0.044231 = queryWeight, product of:
                1.0325341 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.012702672 = queryNorm
              0.36884624 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.109375 = fieldNorm(doc=705)
          0.017668199 = weight(abstract_txt:using in 705) [ClassicSimilarity], result of:
            0.017668199 = score(doc=705,freq=1.0), product of:
              0.04664519 = queryWeight, product of:
                1.0603383 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.012702672 = queryNorm
              0.37877858 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.109375 = fieldNorm(doc=705)
          0.049505044 = weight(abstract_txt:indexing in 705) [ClassicSimilarity], result of:
            0.049505044 = score(doc=705,freq=2.0), product of:
              0.07358144 = queryWeight, product of:
                1.3317575 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.012702672 = queryNorm
              0.6727925 = fieldWeight in 705, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.109375 = fieldNorm(doc=705)
          0.15236653 = weight(abstract_txt:phrase in 705) [ClassicSimilarity], result of:
            0.15236653 = score(doc=705,freq=1.0), product of:
              0.19615757 = queryWeight, product of:
                2.1744206 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.012702672 = queryNorm
              0.7767558 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.109375 = fieldNorm(doc=705)
          0.5968848 = weight(abstract_txt:phrases in 705) [ClassicSimilarity], result of:
            0.5968848 = score(doc=705,freq=3.0), product of:
              0.45871747 = queryWeight, product of:
                5.257553 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.012702672 = queryNorm
              1.3012035 = fieldWeight in 705, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.109375 = fieldNorm(doc=705)
        0.2 = coord(5/25)
    
  3. Losee, R.: ¬A performance model of the length and number of subject headings and index phrases (2004) 0.16
    0.16229704 = sum of:
      0.16229704 = product of:
        0.8114852 = sum of:
          0.035360742 = weight(abstract_txt:indexing in 3725) [ClassicSimilarity], result of:
            0.035360742 = score(doc=3725,freq=2.0), product of:
              0.07358144 = queryWeight, product of:
                1.3317575 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.012702672 = queryNorm
              0.48056605 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.042624462 = weight(abstract_txt:text in 3725) [ClassicSimilarity], result of:
            0.042624462 = score(doc=3725,freq=2.0), product of:
              0.095401905 = queryWeight, product of:
                1.8572277 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012702672 = queryNorm
              0.44678837 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.15391344 = weight(abstract_txt:phrase in 3725) [ClassicSimilarity], result of:
            0.15391344 = score(doc=3725,freq=2.0), product of:
              0.19615757 = queryWeight, product of:
                2.1744206 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.012702672 = queryNorm
              0.78464186 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.08728421 = weight(abstract_txt:rules in 3725) [ClassicSimilarity], result of:
            0.08728421 = score(doc=3725,freq=1.0), product of:
              0.21333617 = queryWeight, product of:
                3.2069209 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012702672 = queryNorm
              0.40913928 = fieldWeight in 3725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.4923023 = weight(abstract_txt:phrases in 3725) [ClassicSimilarity], result of:
            0.4923023 = score(doc=3725,freq=4.0), product of:
              0.45871747 = queryWeight, product of:
                5.257553 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.012702672 = queryNorm
              1.0732147 = fieldWeight in 3725, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
        0.2 = coord(5/25)
    
  4. Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.16
    0.15904336 = sum of:
      0.15904336 = product of:
        0.6626807 = sum of:
          0.010096114 = weight(abstract_txt:using in 1845) [ClassicSimilarity], result of:
            0.010096114 = score(doc=1845,freq=1.0), product of:
              0.04664519 = queryWeight, product of:
                1.0603383 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.012702672 = queryNorm
              0.21644491 = fieldWeight in 1845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.040006116 = weight(abstract_txt:indexing in 1845) [ClassicSimilarity], result of:
            0.040006116 = score(doc=1845,freq=4.0), product of:
              0.07358144 = queryWeight, product of:
                1.3317575 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.012702672 = queryNorm
              0.54369843 = fieldWeight in 1845, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.024133116 = weight(abstract_txt:performance in 1845) [ClassicSimilarity], result of:
            0.024133116 = score(doc=1845,freq=1.0), product of:
              0.08338981 = queryWeight, product of:
                1.417743 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.012702672 = queryNorm
              0.28940126 = fieldWeight in 1845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.034099568 = weight(abstract_txt:text in 1845) [ClassicSimilarity], result of:
            0.034099568 = score(doc=1845,freq=2.0), product of:
              0.095401905 = queryWeight, product of:
                1.8572277 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012702672 = queryNorm
              0.3574307 = fieldWeight in 1845, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.21326874 = weight(abstract_txt:phrase in 1845) [ClassicSimilarity], result of:
            0.21326874 = score(doc=1845,freq=6.0), product of:
              0.19615757 = queryWeight, product of:
                2.1744206 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.012702672 = queryNorm
              1.0872318 = fieldWeight in 1845, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.34107703 = weight(abstract_txt:phrases in 1845) [ClassicSimilarity], result of:
            0.34107703 = score(doc=1845,freq=3.0), product of:
              0.45871747 = queryWeight, product of:
                5.257553 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.012702672 = queryNorm
              0.7435449 = fieldWeight in 1845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
        0.24 = coord(6/25)
    
  5. Frohmann, B.: Rules of indexing : a critique of mentalism in information retrieval theory (1990) 0.13
    0.13080136 = sum of:
      0.13080136 = product of:
        0.5450057 = sum of:
          0.050812688 = weight(abstract_txt:operation in 3908) [ClassicSimilarity], result of:
            0.050812688 = score(doc=3908,freq=1.0), product of:
              0.082975134 = queryWeight, product of:
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.012702672 = queryNorm
              0.6123845 = fieldWeight in 3908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
          0.030004585 = weight(abstract_txt:indexing in 3908) [ClassicSimilarity], result of:
            0.030004585 = score(doc=3908,freq=1.0), product of:
              0.07358144 = queryWeight, product of:
                1.3317575 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.012702672 = queryNorm
              0.40777382 = fieldWeight in 3908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
          0.036168054 = weight(abstract_txt:text in 3908) [ClassicSimilarity], result of:
            0.036168054 = score(doc=3908,freq=1.0), product of:
              0.095401905 = queryWeight, product of:
                1.8572277 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012702672 = queryNorm
              0.37911248 = fieldWeight in 3908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
          0.14929424 = weight(abstract_txt:rule in 3908) [ClassicSimilarity], result of:
            0.14929424 = score(doc=3908,freq=2.0), product of:
              0.170214 = queryWeight, product of:
                2.0255299 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.012702672 = queryNorm
              0.87709737 = fieldWeight in 3908, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
          0.13059989 = weight(abstract_txt:phrase in 3908) [ClassicSimilarity], result of:
            0.13059989 = score(doc=3908,freq=1.0), product of:
              0.19615757 = queryWeight, product of:
                2.1744206 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.012702672 = queryNorm
              0.6657907 = fieldWeight in 3908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
          0.1481262 = weight(abstract_txt:rules in 3908) [ClassicSimilarity], result of:
            0.1481262 = score(doc=3908,freq=2.0), product of:
              0.21333617 = queryWeight, product of:
                3.2069209 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012702672 = queryNorm
              0.69433236 = fieldWeight in 3908, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.09375 = fieldNorm(doc=3908)
        0.24 = coord(6/25)
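
The score breakdowns above all follow the same arithmetic, which can be checked with a short sketch. It assumes the default ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the listing. The function names are chosen here for illustration, and the input numbers are copied from the first entry (doc 2895).

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.012702672


def idf(doc_freq, max_docs=MAX_DOCS):
    # Inverse document frequency as printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = tf * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for the nine terms matched in doc 2895.
matched_terms = [
    ("system", 1.0, 4123, 1.0325341),
    ("using", 3.0, 3765, 1.0603383),
    ("indexing", 2.0, 1551, 1.3317575),
    ("performance", 1.0, 1171, 1.417743),
    ("text", 1.0, 2106, 1.8572277),
    ("rule", 2.0, 160, 2.0255299),
    ("phrase", 1.0, 98, 2.1744206),
    ("rules", 2.0, 638, 3.2069209),
    ("phrases", 1.0, 124, 5.257553),
]
FIELD_NORM = 0.0625  # fieldNorm(doc=2895)

system_score = term_score(1.0, 4123, 1.0325341, FIELD_NORM)
term_sum = sum(term_score(f, df, b, FIELD_NORM) for _, f, df, b in matched_terms)
coord = 9 / 25  # matched query terms / total query terms
total = term_sum * coord

print(f"system: {system_score:.9f}")
print(f"sum:    {term_sum:.8f}")
print(f"total:  {total:.8f}")
```

The results should agree with the listed 0.009322536, 0.58561105 and 0.21081999 up to small rounding differences, since the listing appears to use single-precision floats while Python computes in double precision.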