Document (#19831)

Author
Tseng, Y.-H.
Title
Keyword extraction techniques and relevance feedback
Source
Bulletin of the Library Association of China. 1997, no. 59, Dec., pp. 59-64
Year
1997
Abstract
Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm to relevance feedback in an information retrieval system. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by applying a thesaurus constructed with the same keyword extraction algorithm.
Footnote
[In Chinese]
Theme
Indexing studies

Similar documents (author)

  1. Tseng, Y.-H.: Automatic cataloguing and searching for retrospective data by use of OCR text (2001) 4.57
    4.565969 = sum of:
      4.565969 = weight(author_txt:tseng in 5421) [ClassicSimilarity], result of:
        4.565969 = fieldWeight in 5421, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.5 = fieldNorm(doc=5421)
    
  2. Tseng, Y.-H.: Solving vocabulary problems with interactive query expansion (1998) 4.57
    4.565969 = sum of:
      4.565969 = weight(author_txt:tseng in 5159) [ClassicSimilarity], result of:
        4.565969 = fieldWeight in 5159, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.5 = fieldNorm(doc=5159)
    
  3. Tseng, Y.H.; Lin, Y.I.: Evaluation of fuzzy search, term suggestion, and term relevance feedback in an OPAC system (1998) 4.57
    4.565969 = sum of:
      4.565969 = weight(author_txt:tseng in 6430) [ClassicSimilarity], result of:
        4.565969 = fieldWeight in 6430, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.5 = fieldNorm(doc=6430)
    
  4. Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002) 4.57
    4.565969 = sum of:
      4.565969 = weight(author_txt:tseng in 5226) [ClassicSimilarity], result of:
        4.565969 = fieldWeight in 5226, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.5 = fieldNorm(doc=5226)
    
  5. Drenth, H.; Morris, A.; Tseng, G.: Expert systems as information intermediaries (1991) 3.42
    3.4244766 = sum of:
      3.4244766 = weight(author_txt:tseng in 3695) [ClassicSimilarity], result of:
        3.4244766 = fieldWeight in 3695, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.375 = fieldNorm(doc=3695)
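
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the idf form is an assumption chosen because it matches the figures printed in this record, and the function names are illustrative rather than part of any library.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches this record (assumed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw count."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for author_txt:tseng.
single_author = field_weight(1.0, doc_freq=12, max_docs=44218, field_norm=0.5)
three_authors = field_weight(1.0, doc_freq=12, max_docs=44218, field_norm=0.375)
print(single_author, three_authors)
```

Under these assumptions the only factor that differs between hits 1 to 4 and hit 5 is fieldNorm (0.5 versus 0.375), which is why the record with the longer author field ranks lower even though the term matches once in every case.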
    

Similar documents (content)

  1. Goh, A.; Hui, S.C.; Chan, S.K.: A text extraction system for news reports (1996) 0.24
    0.23938298 = sum of:
      0.23938298 = product of:
        0.8549392 = sum of:
          0.03419602 = weight(abstract_txt:keywords in 6601) [ClassicSimilarity], result of:
            0.03419602 = score(doc=6601,freq=1.0), product of:
              0.09098758 = queryWeight, product of:
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.015131037 = queryNorm
              0.37583172 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.052447025 = weight(abstract_txt:extracting in 6601) [ClassicSimilarity], result of:
            0.052447025 = score(doc=6601,freq=1.0), product of:
              0.12100751 = queryWeight, product of:
                1.1532278 = boost
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.015131037 = queryNorm
              0.4334196 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.026566816 = weight(abstract_txt:results in 6601) [ClassicSimilarity], result of:
            0.026566816 = score(doc=6601,freq=4.0), product of:
              0.061030664 = queryWeight, product of:
                1.1582385 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015131037 = queryNorm
              0.43530276 = fieldWeight in 6601, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.030147364 = weight(abstract_txt:application in 6601) [ClassicSimilarity], result of:
            0.030147364 = score(doc=6601,freq=1.0), product of:
              0.10540017 = queryWeight, product of:
                1.522105 = boost
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.015131037 = queryNorm
              0.28602767 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.037730653 = weight(abstract_txt:relevance in 6601) [ClassicSimilarity], result of:
            0.037730653 = score(doc=6601,freq=1.0), product of:
              0.12240654 = queryWeight, product of:
                1.6403112 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.015131037 = queryNorm
              0.3082405 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.17304386 = weight(abstract_txt:keyword in 6601) [ClassicSimilarity], result of:
            0.17304386 = score(doc=6601,freq=1.0), product of:
              0.45859137 = queryWeight, product of:
                5.0200367 = boost
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.015131037 = queryNorm
              0.3773378 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.5008075 = weight(abstract_txt:extraction in 6601) [ClassicSimilarity], result of:
            0.5008075 = score(doc=6601,freq=5.0), product of:
              0.57877004 = queryWeight, product of:
                6.1778536 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.015131037 = queryNorm
              0.8652962 = fieldWeight in 6601, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
        0.28 = coord(7/25)
    
  2. Ercan, G.; Cicekli, I.: Using lexical chains for keyword extraction (2007) 0.21
    0.21094947 = sum of:
      0.21094947 = product of:
        1.0547473 = sum of:
          0.07254071 = weight(abstract_txt:keywords in 951) [ClassicSimilarity], result of:
            0.07254071 = score(doc=951,freq=2.0), product of:
              0.09098758 = queryWeight, product of:
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.015131037 = queryNorm
              0.79725945 = fieldWeight in 951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.09375 = fieldNorm(doc=951)
          0.019925114 = weight(abstract_txt:results in 951) [ClassicSimilarity], result of:
            0.019925114 = score(doc=951,freq=1.0), product of:
              0.061030664 = queryWeight, product of:
                1.1582385 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015131037 = queryNorm
              0.32647708 = fieldWeight in 951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=951)
          0.013314328 = weight(abstract_txt:that in 951) [ClassicSimilarity], result of:
            0.013314328 = score(doc=951,freq=2.0), product of:
              0.04238194 = queryWeight, product of:
                1.1821157 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015131037 = queryNorm
              0.314151 = fieldWeight in 951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=951)
          0.36708146 = weight(abstract_txt:keyword in 951) [ClassicSimilarity], result of:
            0.36708146 = score(doc=951,freq=2.0), product of:
              0.45859137 = queryWeight, product of:
                5.0200367 = boost
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.015131037 = queryNorm
              0.8004544 = fieldWeight in 951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.09375 = fieldNorm(doc=951)
          0.5818857 = weight(abstract_txt:extraction in 951) [ClassicSimilarity], result of:
            0.5818857 = score(doc=951,freq=3.0), product of:
              0.57877004 = queryWeight, product of:
                6.1778536 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.015131037 = queryNorm
              1.0053833 = fieldWeight in 951, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.09375 = fieldNorm(doc=951)
        0.2 = coord(5/25)
    
  3. Ning, X.; Jin, H.; Jia, W.; Yuan, P.: Practical and effective IR-style keyword search over semantic web (2009) 0.17
    0.16849141 = sum of:
      0.16849141 = product of:
        0.601755 = sum of:
          0.05129403 = weight(abstract_txt:keywords in 4213) [ClassicSimilarity], result of:
            0.05129403 = score(doc=4213,freq=1.0), product of:
              0.09098758 = queryWeight, product of:
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.015131037 = queryNorm
              0.5637476 = fieldWeight in 4213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.017521467 = weight(abstract_txt:data in 4213) [ClassicSimilarity], result of:
            0.017521467 = score(doc=4213,freq=1.0), product of:
              0.056018036 = queryWeight, product of:
                1.1096548 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015131037 = queryNorm
              0.31278262 = fieldWeight in 4213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.019925114 = weight(abstract_txt:results in 4213) [ClassicSimilarity], result of:
            0.019925114 = score(doc=4213,freq=1.0), product of:
              0.061030664 = queryWeight, product of:
                1.1582385 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015131037 = queryNorm
              0.32647708 = fieldWeight in 4213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.016306654 = weight(abstract_txt:that in 4213) [ClassicSimilarity], result of:
            0.016306654 = score(doc=4213,freq=3.0), product of:
              0.04238194 = queryWeight, product of:
                1.1821157 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015131037 = queryNorm
              0.38475478 = fieldWeight in 4213, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.042002916 = weight(abstract_txt:retrieval in 4213) [ClassicSimilarity], result of:
            0.042002916 = score(doc=4213,freq=2.0), product of:
              0.09116349 = queryWeight, product of:
                1.7337244 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015131037 = queryNorm
              0.4607427 = fieldWeight in 4213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.08762339 = weight(abstract_txt:algorithm in 4213) [ClassicSimilarity], result of:
            0.08762339 = score(doc=4213,freq=1.0), product of:
              0.16381775 = queryWeight, product of:
                1.8975989 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.015131037 = queryNorm
              0.5348834 = fieldWeight in 4213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
          0.36708146 = weight(abstract_txt:keyword in 4213) [ClassicSimilarity], result of:
            0.36708146 = score(doc=4213,freq=2.0), product of:
              0.45859137 = queryWeight, product of:
                5.0200367 = boost
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.015131037 = queryNorm
              0.8004544 = fieldWeight in 4213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.09375 = fieldNorm(doc=4213)
        0.28 = coord(7/25)
    
  4. Semantic keyword-based search on structured data sources : COST Action IC1302. Second International KEYSTONE Conference, IKC 2016, Cluj-Napoca, Romania, September 8-9, 2016, Revised Selected Papers (2017) 0.16
    0.15774405 = sum of:
      0.15774405 = product of:
        0.98590034 = sum of:
          0.020441713 = weight(abstract_txt:data in 3479) [ClassicSimilarity], result of:
            0.020441713 = score(doc=3479,freq=1.0), product of:
              0.056018036 = queryWeight, product of:
                1.1096548 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015131037 = queryNorm
              0.36491305 = fieldWeight in 3479, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.109375 = fieldNorm(doc=3479)
          0.0490034 = weight(abstract_txt:retrieval in 3479) [ClassicSimilarity], result of:
            0.0490034 = score(doc=3479,freq=2.0), product of:
              0.09116349 = queryWeight, product of:
                1.7337244 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015131037 = queryNorm
              0.53753316 = fieldWeight in 3479, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.109375 = fieldNorm(doc=3479)
          0.52451134 = weight(abstract_txt:keyword in 3479) [ClassicSimilarity], result of:
            0.52451134 = score(doc=3479,freq=3.0), product of:
              0.45859137 = queryWeight, product of:
                5.0200367 = boost
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.015131037 = queryNorm
              1.1437445 = fieldWeight in 3479, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.109375 = fieldNorm(doc=3479)
          0.39194387 = weight(abstract_txt:extraction in 3479) [ClassicSimilarity], result of:
            0.39194387 = score(doc=3479,freq=1.0), product of:
              0.57877004 = queryWeight, product of:
                6.1778536 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.015131037 = queryNorm
              0.6772014 = fieldWeight in 3479, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.109375 = fieldNorm(doc=3479)
        0.16 = coord(4/25)
    
  5. Colace, F.; Santo, M. de; Greco, L.; Napoletano, P.: Improving relevance feedback-based query expansion by the use of a weighted word pairs approach (2015) 0.15
    0.14863822 = sum of:
      0.14863822 = product of:
        0.61932594 = sum of:
          0.016604261 = weight(abstract_txt:results in 2263) [ClassicSimilarity], result of:
            0.016604261 = score(doc=2263,freq=1.0), product of:
              0.061030664 = queryWeight, product of:
                1.1582385 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015131037 = queryNorm
              0.27206424 = fieldWeight in 2263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
          0.007845543 = weight(abstract_txt:that in 2263) [ClassicSimilarity], result of:
            0.007845543 = score(doc=2263,freq=1.0), product of:
              0.04238194 = queryWeight, product of:
                1.1821157 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015131037 = queryNorm
              0.18511525 = fieldWeight in 2263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
          0.04716332 = weight(abstract_txt:relevance in 2263) [ClassicSimilarity], result of:
            0.04716332 = score(doc=2263,freq=1.0), product of:
              0.12240654 = queryWeight, product of:
                1.6403112 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.015131037 = queryNorm
              0.38530064 = fieldWeight in 2263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
          0.03500243 = weight(abstract_txt:retrieval in 2263) [ClassicSimilarity], result of:
            0.03500243 = score(doc=2263,freq=2.0), product of:
              0.09116349 = queryWeight, product of:
                1.7337244 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015131037 = queryNorm
              0.38395226 = fieldWeight in 2263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
          0.116787314 = weight(abstract_txt:feedback in 2263) [ClassicSimilarity], result of:
            0.116787314 = score(doc=2263,freq=2.0), product of:
              0.17782336 = queryWeight, product of:
                1.9770532 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.015131037 = queryNorm
              0.6567602 = fieldWeight in 2263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
          0.39592308 = weight(abstract_txt:extraction in 2263) [ClassicSimilarity], result of:
            0.39592308 = score(doc=2263,freq=2.0), product of:
              0.57877004 = queryWeight, product of:
                6.1778536 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.015131037 = queryNorm
              0.68407667 = fieldWeight in 2263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.078125 = fieldNorm(doc=2263)
        0.24 = coord(6/25)
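
The content-similarity scores combine several term scores with a coordination factor. The sketch below, using the same assumed idf form as the figures in this record imply, recomputes the score of hit 4 (doc 3479) from the boosts, document frequencies, and term frequencies listed in its explain tree. The helper names and the tuple layout are hypothetical; queryNorm, maxDocs, and the 25 query terms are copied from the output above.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain output
QUERY_NORM = 0.015131037  # queryNorm from the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, query_term_count: int) -> float:
    """Sum the term scores, then scale by coord = matched terms / query terms."""
    total = sum(term_score(b, df, f, field_norm) for b, df, f in terms)
    coord = len(terms) / query_term_count
    return coord * total


# (boost, docFreq, termFreq) for data, retrieval, keyword, extraction in doc 3479.
doc_3479 = [
    (1.1096548, 4274, 1.0),
    (1.7337244, 3720, 2.0),
    (5.0200367, 286, 3.0),
    (6.1778536, 245, 1.0),
]
score = document_score(doc_3479, field_norm=0.109375, query_term_count=25)
print(score)
```

The coord factor explains why hit 4 scores below hit 1 despite a larger raw sum (0.986 versus 0.855): it matches only 4 of the 25 query terms, whereas hit 1 matches 7.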