Document (#34451)

Author
Otterbacher, J.
Erkan, G.
Radev, D.R.
Title
Biased LexRank : passage retrieval using random walks with question-based priors
Source
Information processing and management. 45(2009) no.1, pp.42-54
Year
2009
Abstract
We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user's natural language question. We then perform a random walk on the lexical similarity graph in order to recursively retrieve additional passages that are similar to other relevant passages. We present results on several benchmarks that show the applicability of our work to question answering and topic-focused text summarization.
Theme
Retrieval algorithms
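
The procedure outlined in the abstract (a question-based prior combined with a random walk over a lexical similarity graph) can be illustrated with a minimal sketch. The record gives no formulas, so everything below is an assumption made for illustration: the personalized-PageRank-style update, the bag-of-words cosine similarity, the mixing weight `d`, and the function names `cosine` and `biased_random_walk`. It is not the authors' implementation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def biased_random_walk(passages, question, d=0.5, iters=100, tol=1e-9):
    """Score passages by a random walk biased toward the question.

    d is the (assumed) weight given to the question-based prior;
    1 - d is the weight given to similarity with other passages.
    """
    bags = [Counter(p.lower().split()) for p in passages]
    query = Counter(question.lower().split())
    n = len(bags)

    # Question-based prior: relevance of each passage to the question,
    # normalized to sum to 1 (uniform if nothing matches).
    rel = [cosine(bag, query) for bag in bags]
    total = sum(rel)
    prior = [r / total for r in rel] if total else [1.0 / n] * n

    # Lexical similarity graph and its column sums for normalization.
    sim = [[cosine(bags[u], bags[v]) for v in range(n)] for u in range(n)]
    col = [sum(sim[z][v] for z in range(n)) for v in range(n)]

    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            d * prior[u]
            + (1 - d) * sum(
                sim[u][v] / col[v] * scores[v] for v in range(n) if col[v]
            )
            for u in range(n)
        ]
        converged = max(abs(x - y) for x, y in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "a cat sat on a rug",
        "stock markets fell sharply today",
    ]
    for text, score in zip(docs, biased_random_walk(docs, "where did the cat sit")):
        print(f"{score:.4f}  {text}")
```

In this toy run the second passage gains score both from its own overlap with the question and from its similarity to the first passage, which is the recursive effect the abstract describes.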

Similar documents (author)

  1. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 4.57
    4.565969 = sum of:
      4.565969 = weight(author_txt:radev in 2210) [ClassicSimilarity], result of:
        4.565969 = fieldWeight in 2210, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.5 = fieldNorm(doc=2210)
    
  2. Finegan-Dollak, C.; Radev, D.R.: Sentence simplification, compression, and disaggregation for summarization of sophisticated documents (2016) 4.00
    3.9952228 = sum of:
      3.9952228 = weight(author_txt:radev in 3122) [ClassicSimilarity], result of:
        3.9952228 = fieldWeight in 3122, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.4375 = fieldNorm(doc=3122)
    
  3. Radev, D.R.; Libner, K.; Fan, W.: Getting answers to natural language questions on the Web (2002) 3.42
    3.4244766 = sum of:
      3.4244766 = weight(author_txt:radev in 5204) [ClassicSimilarity], result of:
        3.4244766 = fieldWeight in 5204, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.375 = fieldNorm(doc=5204)
    
  4. Otterbacher, J.; Radev, D.; Kareem, O.: Hierarchical summarization for delivering information to mobile devices (2008) 3.42
    3.4244766 = sum of:
      3.4244766 = weight(author_txt:radev in 2071) [ClassicSimilarity], result of:
        3.4244766 = fieldWeight in 2071, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.375 = fieldNorm(doc=2071)
    
  5. Lam, W.; Chan, K.; Radev, D.; Saggion, H.; Teufel, S.: Context-based generic cross-lingual retrieval of documents and automated summaries (2005) 2.85
    2.8537307 = sum of:
      2.8537307 = weight(author_txt:radev in 1965) [ClassicSimilarity], result of:
        2.8537307 = fieldWeight in 1965, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.3125 = fieldNorm(doc=1965)
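
The author-match scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. A short sketch can reproduce them, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures listed. The function names are illustrative only, and fieldNorm is taken as given from the listing rather than derived.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency in the form that matches the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Entry 1: author_txt:radev in doc 2210, fieldNorm 0.5
    print(round(field_weight(1.0, 12, 44218, 0.5), 4))     # about 4.566
    # Entry 5: author_txt:radev in doc 1965, fieldNorm 0.3125
    print(round(field_weight(1.0, 12, 44218, 0.3125), 4))  # about 2.8537
```

Because tf and idf are identical for all five entries, the ranking in this list is decided entirely by fieldNorm, which is larger for documents with shorter author fields.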
    

Similar documents (content)

  1. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.67
    0.6733477 = sum of:
      0.6733477 = product of:
        1.8704102 = sum of:
          0.007423792 = weight(abstract_txt:based in 2765) [ClassicSimilarity], result of:
            0.007423792 = score(doc=2765,freq=1.0), product of:
              0.042582314 = queryWeight, product of:
                1.039055 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012855301 = queryNorm
              0.1743398 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.009144883 = weight(abstract_txt:that in 2765) [ClassicSimilarity], result of:
            0.009144883 = score(doc=2765,freq=4.0), product of:
              0.035286445 = queryWeight, product of:
                1.1584399 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012855301 = queryNorm
              0.25916135 = fieldWeight in 2765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.03030546 = weight(abstract_txt:text in 2765) [ClassicSimilarity], result of:
            0.03030546 = score(doc=2765,freq=4.0), product of:
              0.06851821 = queryWeight, product of:
                1.3180349 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012855301 = queryNorm
              0.4422979 = fieldWeight in 2765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.018847406 = weight(abstract_txt:present in 2765) [ClassicSimilarity], result of:
            0.018847406 = score(doc=2765,freq=1.0), product of:
              0.07924645 = queryWeight, product of:
                1.4174699 = boost
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.012855301 = queryNorm
              0.23783283 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.022824699 = weight(abstract_txt:relevant in 2765) [ClassicSimilarity], result of:
            0.022824699 = score(doc=2765,freq=1.0), product of:
              0.09003584 = queryWeight, product of:
                1.5108857 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.012855301 = queryNorm
              0.2535068 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.032254722 = weight(abstract_txt:retrieval in 2765) [ClassicSimilarity], result of:
            0.032254722 = score(doc=2765,freq=5.0), product of:
              0.07590109 = queryWeight, product of:
                1.6990007 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012855301 = queryNorm
              0.4249573 = fieldWeight in 2765, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.045152247 = weight(abstract_txt:similarity in 2765) [ClassicSimilarity], result of:
            0.045152247 = score(doc=2765,freq=1.0), product of:
              0.14188342 = queryWeight, product of:
                1.8966612 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.012855301 = queryNorm
              0.31823483 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.7255264 = weight(abstract_txt:passage in 2765) [ClassicSimilarity], result of:
            0.7255264 = score(doc=2765,freq=14.0), product of:
              0.42910996 = queryWeight, product of:
                4.039744 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.012855301 = queryNorm
              1.6907703 = fieldWeight in 2765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.97893053 = weight(abstract_txt:passages in 2765) [ClassicSimilarity], result of:
            0.97893053 = score(doc=2765,freq=14.0), product of:
              0.57669646 = queryWeight, product of:
                5.4076996 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.012855301 = queryNorm
              1.6974797 = fieldWeight in 2765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
        0.36 = coord(9/25)
    
  2. Kaszkiel, M.; Zobel, J.: Effective ranking with arbitrary passages (2001) 0.25
    0.2514934 = sum of:
      0.2514934 = product of:
        1.0478891 = sum of:
          0.007390181 = weight(abstract_txt:that in 5764) [ClassicSimilarity], result of:
            0.007390181 = score(doc=5764,freq=2.0), product of:
              0.035286445 = queryWeight, product of:
                1.1584399 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012855301 = queryNorm
              0.20943399 = fieldWeight in 5764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
          0.029994626 = weight(abstract_txt:text in 5764) [ClassicSimilarity], result of:
            0.029994626 = score(doc=5764,freq=3.0), product of:
              0.06851821 = queryWeight, product of:
                1.3180349 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012855301 = queryNorm
              0.4377614 = fieldWeight in 5764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
          0.02608537 = weight(abstract_txt:relevant in 5764) [ClassicSimilarity], result of:
            0.02608537 = score(doc=5764,freq=1.0), product of:
              0.09003584 = queryWeight, product of:
                1.5108857 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.012855301 = queryNorm
              0.28972206 = fieldWeight in 5764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
          0.023313917 = weight(abstract_txt:retrieval in 5764) [ClassicSimilarity], result of:
            0.023313917 = score(doc=5764,freq=2.0), product of:
              0.07590109 = queryWeight, product of:
                1.6990007 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012855301 = queryNorm
              0.3071618 = fieldWeight in 5764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
          0.44321162 = weight(abstract_txt:passage in 5764) [ClassicSimilarity], result of:
            0.44321162 = score(doc=5764,freq=4.0), product of:
              0.42910996 = queryWeight, product of:
                4.039744 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.012855301 = queryNorm
              1.0328625 = fieldWeight in 5764, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
          0.51789343 = weight(abstract_txt:passages in 5764) [ClassicSimilarity], result of:
            0.51789343 = score(doc=5764,freq=3.0), product of:
              0.57669646 = queryWeight, product of:
                5.4076996 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.012855301 = queryNorm
              0.89803475 = fieldWeight in 5764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=5764)
        0.24 = coord(6/25)
    
  3. Melucci, M.: Passage retrieval : a probabilistic technique (1998) 0.24
    0.24325602 = sum of:
      0.24325602 = product of:
        1.0135667 = sum of:
          0.014998325 = weight(abstract_txt:based in 1150) [ClassicSimilarity], result of:
            0.014998325 = score(doc=1150,freq=2.0), product of:
              0.042582314 = queryWeight, product of:
                1.039055 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012855301 = queryNorm
              0.35221958 = fieldWeight in 1150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
          0.009237726 = weight(abstract_txt:that in 1150) [ClassicSimilarity], result of:
            0.009237726 = score(doc=1150,freq=2.0), product of:
              0.035286445 = queryWeight, product of:
                1.1584399 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012855301 = queryNorm
              0.26179248 = fieldWeight in 1150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
          0.048403624 = weight(abstract_txt:text in 1150) [ClassicSimilarity], result of:
            0.048403624 = score(doc=1150,freq=5.0), product of:
              0.06851821 = queryWeight, product of:
                1.3180349 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012855301 = queryNorm
              0.7064345 = fieldWeight in 1150, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
          0.020606786 = weight(abstract_txt:retrieval in 1150) [ClassicSimilarity], result of:
            0.020606786 = score(doc=1150,freq=1.0), product of:
              0.07590109 = queryWeight, product of:
                1.6990007 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012855301 = queryNorm
              0.27149525 = fieldWeight in 1150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
          0.39174742 = weight(abstract_txt:passage in 1150) [ClassicSimilarity], result of:
            0.39174742 = score(doc=1150,freq=2.0), product of:
              0.42910996 = queryWeight, product of:
                4.039744 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.012855301 = queryNorm
              0.91293013 = fieldWeight in 1150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
          0.5285728 = weight(abstract_txt:passages in 1150) [ClassicSimilarity], result of:
            0.5285728 = score(doc=1150,freq=2.0), product of:
              0.57669646 = queryWeight, product of:
                5.4076996 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.012855301 = queryNorm
              0.91655284 = fieldWeight in 1150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.078125 = fieldNorm(doc=1150)
        0.24 = coord(6/25)
    
  4. Landauer, T.K.; Foltz, P.W.; Laham, D.: An introduction to Latent Semantic Analysis (1998) 0.23
    0.22845334 = sum of:
      0.22845334 = product of:
        0.9518889 = sum of:
          0.009237726 = weight(abstract_txt:that in 1162) [ClassicSimilarity], result of:
            0.009237726 = score(doc=1162,freq=2.0), product of:
              0.035286445 = queryWeight, product of:
                1.1584399 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012855301 = queryNorm
              0.26179248 = fieldWeight in 1162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.021646757 = weight(abstract_txt:text in 1162) [ClassicSimilarity], result of:
            0.021646757 = score(doc=1162,freq=1.0), product of:
              0.06851821 = queryWeight, product of:
                1.3180349 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012855301 = queryNorm
              0.3159271 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.06450321 = weight(abstract_txt:similarity in 1162) [ClassicSimilarity], result of:
            0.06450321 = score(doc=1162,freq=1.0), product of:
              0.14188342 = queryWeight, product of:
                1.8966612 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.012855301 = queryNorm
              0.4546212 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.09099639 = weight(abstract_txt:lexical in 1162) [ClassicSimilarity], result of:
            0.09099639 = score(doc=1162,freq=1.0), product of:
              0.17846793 = queryWeight, product of:
                2.127179 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.012855301 = queryNorm
              0.5098753 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.39174742 = weight(abstract_txt:passage in 1162) [ClassicSimilarity], result of:
            0.39174742 = score(doc=1162,freq=2.0), product of:
              0.42910996 = queryWeight, product of:
                4.039744 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.012855301 = queryNorm
              0.91293013 = fieldWeight in 1162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.3737574 = weight(abstract_txt:passages in 1162) [ClassicSimilarity], result of:
            0.3737574 = score(doc=1162,freq=1.0), product of:
              0.57669646 = queryWeight, product of:
                5.4076996 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.012855301 = queryNorm
              0.64810073 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
        0.24 = coord(6/25)
    
  5. Salton, G.: Automatic text structuring and summarization (1997) 0.22
    0.21718697 = sum of:
      0.21718697 = product of:
        0.90494573 = sum of:
          0.045378834 = weight(abstract_txt:perform in 145) [ClassicSimilarity], result of:
            0.045378834 = score(doc=145,freq=1.0), product of:
              0.07888277 = queryWeight, product of:
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.012855301 = queryNorm
              0.5752693 = fieldWeight in 145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
          0.012726502 = weight(abstract_txt:based in 145) [ClassicSimilarity], result of:
            0.012726502 = score(doc=145,freq=1.0), product of:
              0.042582314 = queryWeight, product of:
                1.039055 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012855301 = queryNorm
              0.29886824 = fieldWeight in 145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
          0.007838471 = weight(abstract_txt:that in 145) [ClassicSimilarity], result of:
            0.007838471 = score(doc=145,freq=1.0), product of:
              0.035286445 = queryWeight, product of:
                1.1584399 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012855301 = queryNorm
              0.22213829 = fieldWeight in 145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
          0.058084346 = weight(abstract_txt:text in 145) [ClassicSimilarity], result of:
            0.058084346 = score(doc=145,freq=5.0), product of:
              0.06851821 = queryWeight, product of:
                1.3180349 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.012855301 = queryNorm
              0.84772134 = fieldWeight in 145, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
          0.3324087 = weight(abstract_txt:passage in 145) [ClassicSimilarity], result of:
            0.3324087 = score(doc=145,freq=1.0), product of:
              0.42910996 = queryWeight, product of:
                4.039744 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.012855301 = queryNorm
              0.7746469 = fieldWeight in 145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
          0.4485089 = weight(abstract_txt:passages in 145) [ClassicSimilarity], result of:
            0.4485089 = score(doc=145,freq=1.0), product of:
              0.57669646 = queryWeight, product of:
                5.4076996 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.012855301 = queryNorm
              0.7777209 = fieldWeight in 145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.09375 = fieldNorm(doc=145)
        0.24 = coord(6/25)
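
The content-similarity scores above combine three steps: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (matched query terms over total query terms). The sketch below reproduces that arithmetic for the first entry (doc 2765), under the same assumed tf and idf definitions that fit the figures in the listing. The function names are illustrative, and boost, queryNorm and fieldNorm are copied from the listing rather than computed.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency in the form that matches the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    """Sum of term scores scaled by coord(matched / total)."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


if __name__ == "__main__":
    # abstract_txt:passage in doc 2765 (freq=14, docFreq=30)
    passage = term_score(14.0, 30, 44218, 4.039744, 0.012855301, 0.0546875)
    print(round(passage, 4))  # about 0.7255

    # The nine term scores listed for doc 2765, with coord(9/25)
    listed = [0.007423792, 0.009144883, 0.03030546, 0.018847406,
              0.022824699, 0.032254722, 0.045152247, 0.7255264, 0.97893053]
    print(round(document_score(listed, 25), 4))  # about 0.6733
```

The two rare terms "passage" and "passages" (docFreq 30 and 29) carry almost all of each total, since idf enters both queryWeight and fieldWeight and is therefore effectively squared.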