Document (#32950)

Author
Steinberger, J.
Poesio, M.
Kabadjov, M.A.
Jezek, K.
Title
Two uses of anaphora resolution in summarization
Source
Information processing and management. 43(2007) no.6, pp.1663-1680
Year
2007
Abstract
We propose a new method for using anaphoric information in Latent Semantic Analysis (LSA), and discuss its application to develop an LSA-based summarizer which achieves a significantly better performance than a system not using anaphoric information, and a better performance by the ROUGE measure than all but one of the single-document summarizers participating in DUC-2002. Anaphoric information is automatically extracted using a new release of our own anaphora resolution system, GUITAR, which incorporates proper noun resolution. Our summarizer also includes a new approach for automatically identifying the dimensionality reduction of a document on the basis of the desired summarization percentage. Anaphoric information is also used to check the coherence of the summary produced by our summarizer, by a reference checker module which identifies anaphoric resolution errors caused by sentence extraction.
Theme
Automatic abstracting

Similar documents (content)

  1. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.25
    0.24760029 = sum of:
      0.24760029 = product of:
        0.7737509 = sum of:
          0.021365607 = weight(abstract_txt:document in 2693) [ClassicSimilarity], result of:
            0.021365607 = score(doc=2693,freq=1.0), product of:
              0.07963683 = queryWeight, product of:
                1.2071588 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015368387 = queryNorm
              0.26828802 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.010053668 = weight(abstract_txt:which in 2693) [ClassicSimilarity], result of:
            0.010053668 = score(doc=2693,freq=1.0), product of:
              0.055150643 = queryWeight, product of:
                1.2303491 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.015368387 = queryNorm
              0.18229467 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.100367695 = weight(abstract_txt:rouge in 2693) [ClassicSimilarity], result of:
            0.100367695 = score(doc=2693,freq=1.0), product of:
              0.17729226 = queryWeight, product of:
                1.2736125 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.015368387 = queryNorm
              0.56611437 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.026817156 = weight(abstract_txt:performance in 2693) [ClassicSimilarity], result of:
            0.026817156 = score(doc=2693,freq=1.0), product of:
              0.092664264 = queryWeight, product of:
                1.3021576 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015368387 = queryNorm
              0.28940126 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.02917787 = weight(abstract_txt:better in 2693) [ClassicSimilarity], result of:
            0.02917787 = score(doc=2693,freq=1.0), product of:
              0.098025605 = queryWeight, product of:
                1.3392979 = boost
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.015368387 = queryNorm
              0.2976556 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.029147774 = weight(abstract_txt:using in 2693) [ClassicSimilarity], result of:
            0.029147774 = score(doc=2693,freq=3.0), product of:
              0.07774946 = queryWeight, product of:
                1.460837 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015368387 = queryNorm
              0.37489358 = fieldWeight in 2693, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.24008195 = weight(abstract_txt:summarization in 2693) [ClassicSimilarity], result of:
            0.24008195 = score(doc=2693,freq=6.0), product of:
              0.21986683 = queryWeight, product of:
                2.0057983 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.015368387 = queryNorm
              1.0919425 = fieldWeight in 2693, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.31673917 = weight(abstract_txt:summarizer in 2693) [ClassicSimilarity], result of:
            0.31673917 = score(doc=2693,freq=1.0), product of:
              0.5501343 = queryWeight, product of:
                3.8858624 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015368387 = queryNorm
              0.5757488 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
        0.32 = coord(8/25)
    
  2. Wu, D.-S.; Liang, T.: Chinese pronominal anaphora resolution using lexical knowledge and entropy-based weight (2008) 0.25
    0.2465174 = sum of:
      0.2465174 = product of:
        1.5407338 = sum of:
          0.040225733 = weight(abstract_txt:performance in 2367) [ClassicSimilarity], result of:
            0.040225733 = score(doc=2367,freq=1.0), product of:
              0.092664264 = queryWeight, product of:
                1.3021576 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015368387 = queryNorm
              0.43410188 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.09375 = fieldNorm(doc=2367)
          0.025242712 = weight(abstract_txt:using in 2367) [ClassicSimilarity], result of:
            0.025242712 = score(doc=2367,freq=1.0), product of:
              0.07774946 = queryWeight, product of:
                1.460837 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015368387 = queryNorm
              0.32466736 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.09375 = fieldNorm(doc=2367)
          1.0500057 = weight(title_txt:anaphora in 2367) [ClassicSimilarity], result of:
            1.0500057 = score(doc=2367,freq=1.0), product of:
              0.42402512 = queryWeight, product of:
                2.7855003 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.015368387 = queryNorm
              2.476282 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.25 = fieldNorm(doc=2367)
          0.42525977 = weight(abstract_txt:resolution in 2367) [ClassicSimilarity], result of:
            0.42525977 = score(doc=2367,freq=2.0), product of:
              0.44635373 = queryWeight, product of:
                4.041681 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.015368387 = queryNorm
              0.9527416 = fieldWeight in 2367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.09375 = fieldNorm(doc=2367)
        0.16 = coord(4/25)
    
  3. Haag, M.: Automatic text summarization : Evaluation des Copernic Summarizer und mögliche Einsatzfelder in der Fachinformation der DaimlerChrysler AG (2002) 0.21
    0.21019398 = sum of:
      0.21019398 = product of:
        1.0509698 = sum of:
          0.012567085 = weight(abstract_txt:which in 649) [ClassicSimilarity], result of:
            0.012567085 = score(doc=649,freq=1.0), product of:
              0.055150643 = queryWeight, product of:
                1.2303491 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.015368387 = queryNorm
              0.22786833 = fieldWeight in 649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.078125 = fieldNorm(doc=649)
          0.016596159 = weight(abstract_txt:information in 649) [ClassicSimilarity], result of:
            0.016596159 = score(doc=649,freq=3.0), product of:
              0.050660767 = queryWeight, product of:
                1.3616275 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015368387 = queryNorm
              0.32759392 = fieldWeight in 649, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=649)
          0.056694355 = weight(abstract_txt:automatically in 649) [ClassicSimilarity], result of:
            0.056694355 = score(doc=649,freq=1.0), product of:
              0.13153975 = queryWeight, product of:
                1.5514433 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.015368387 = queryNorm
              0.4310055 = fieldWeight in 649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.078125 = fieldNorm(doc=649)
          0.17326422 = weight(abstract_txt:summarization in 649) [ClassicSimilarity], result of:
            0.17326422 = score(doc=649,freq=2.0), product of:
              0.21986683 = queryWeight, product of:
                2.0057983 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.015368387 = queryNorm
              0.78804165 = fieldWeight in 649, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.078125 = fieldNorm(doc=649)
          0.79184794 = weight(abstract_txt:summarizer in 649) [ClassicSimilarity], result of:
            0.79184794 = score(doc=649,freq=4.0), product of:
              0.5501343 = queryWeight, product of:
                3.8858624 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015368387 = queryNorm
              1.4393721 = fieldWeight in 649, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=649)
        0.2 = coord(5/25)
    
  4. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.18
    0.17792308 = sum of:
      0.17792308 = product of:
        0.4448077 = sum of:
          0.049405776 = weight(abstract_txt:achieves in 947) [ClassicSimilarity], result of:
            0.049405776 = score(doc=947,freq=1.0), product of:
              0.12082039 = queryWeight, product of:
                1.0513868 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.015368387 = queryNorm
              0.4089192 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.0139672505 = weight(abstract_txt:than in 947) [ClassicSimilarity], result of:
            0.0139672505 = score(doc=947,freq=1.0), product of:
              0.06557008 = queryWeight, product of:
                1.0953686 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.015368387 = queryNorm
              0.21301256 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.02643859 = weight(abstract_txt:document in 947) [ClassicSimilarity], result of:
            0.02643859 = score(doc=947,freq=2.0), product of:
              0.07963683 = queryWeight, product of:
                1.2071588 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015368387 = queryNorm
              0.3319895 = fieldWeight in 947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.00879696 = weight(abstract_txt:which in 947) [ClassicSimilarity], result of:
            0.00879696 = score(doc=947,freq=1.0), product of:
              0.055150643 = queryWeight, product of:
                1.2303491 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.015368387 = queryNorm
              0.15950784 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.08782173 = weight(abstract_txt:rouge in 947) [ClassicSimilarity], result of:
            0.08782173 = score(doc=947,freq=1.0), product of:
              0.17729226 = queryWeight, product of:
                1.2736125 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.015368387 = queryNorm
              0.49535006 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.033184536 = weight(abstract_txt:performance in 947) [ClassicSimilarity], result of:
            0.033184536 = score(doc=947,freq=2.0), product of:
              0.092664264 = queryWeight, product of:
                1.3021576 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.015368387 = queryNorm
              0.3581158 = fieldWeight in 947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.025530638 = weight(abstract_txt:better in 947) [ClassicSimilarity], result of:
            0.025530638 = score(doc=947,freq=1.0), product of:
              0.098025605 = queryWeight, product of:
                1.3392979 = boost
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.015368387 = queryNorm
              0.26044866 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.013414516 = weight(abstract_txt:information in 947) [ClassicSimilarity], result of:
            0.013414516 = score(doc=947,freq=4.0), product of:
              0.050660767 = queryWeight, product of:
                1.3616275 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015368387 = queryNorm
              0.264791 = fieldWeight in 947, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.014724915 = weight(abstract_txt:using in 947) [ClassicSimilarity], result of:
            0.014724915 = score(doc=947,freq=1.0), product of:
              0.07774946 = queryWeight, product of:
                1.460837 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015368387 = queryNorm
              0.18938929 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.17152283 = weight(abstract_txt:summarization in 947) [ClassicSimilarity], result of:
            0.17152283 = score(doc=947,freq=4.0), product of:
              0.21986683 = queryWeight, product of:
                2.0057983 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.015368387 = queryNorm
              0.78012145 = fieldWeight in 947, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
        0.4 = coord(10/25)
    
  5. Aker, A.; Gaizauskas, R.: Generating descriptive multi-document summaries of geo-located entities using entity type models (2015) 0.17
    0.17292516 = sum of:
      0.17292516 = product of:
        0.8646258 = sum of:
          0.037006315 = weight(abstract_txt:document in 1726) [ClassicSimilarity], result of:
            0.037006315 = score(doc=1726,freq=3.0), product of:
              0.07963683 = queryWeight, product of:
                1.2071588 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015368387 = queryNorm
              0.46468848 = fieldWeight in 1726, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=1726)
          0.037629616 = weight(abstract_txt:using in 1726) [ClassicSimilarity], result of:
            0.037629616 = score(doc=1726,freq=5.0), product of:
              0.07774946 = queryWeight, product of:
                1.460837 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015368387 = queryNorm
              0.48398554 = fieldWeight in 1726, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1726)
          0.045355484 = weight(abstract_txt:automatically in 1726) [ClassicSimilarity], result of:
            0.045355484 = score(doc=1726,freq=1.0), product of:
              0.13153975 = queryWeight, product of:
                1.5514433 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.015368387 = queryNorm
              0.3448044 = fieldWeight in 1726, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.0625 = fieldNorm(doc=1726)
          0.19602609 = weight(abstract_txt:summarization in 1726) [ClassicSimilarity], result of:
            0.19602609 = score(doc=1726,freq=4.0), product of:
              0.21986683 = queryWeight, product of:
                2.0057983 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.015368387 = queryNorm
              0.89156735 = fieldWeight in 1726, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=1726)
          0.5486083 = weight(abstract_txt:summarizer in 1726) [ClassicSimilarity], result of:
            0.5486083 = score(doc=1726,freq=3.0), product of:
              0.5501343 = queryWeight, product of:
                3.8858624 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015368387 = queryNorm
              0.9972262 = fieldWeight in 1726, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=1726)
        0.2 = coord(5/25)
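
The score trees above can be re-derived by hand. The following is a minimal sketch, not the search engine's own code. It assumes the relations that the listed numbers appear to follow: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(termFreq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final score equal to the sum of the term scores multiplied by coord (matched terms / 25). The function and variable names (`idf`, `term_score`, `document_score`, `TERMS_2693`, `LISTED`) are illustrative only; all numeric inputs are copied from the listing.

```python
import math

MAX_DOCS = 44218          # maxDocs reported in every idf(...) line
QUERY_NORM = 0.015368387  # queryNorm reported in every queryWeight block
QUERY_TERMS = 25          # denominator of every coord(n/25) line


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form, see lead-in)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """Score of one matched term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, boost, docFreq, termFreq, fieldNorm) copied from the tree for doc 2693.
TERMS_2693 = [
    ("document",      1.2071588, 1642, 1.0, 0.0625),
    ("which",         1.2303491, 6503, 1.0, 0.0625),
    ("rouge",         1.2736125,   13, 1.0, 0.0625),
    ("performance",   1.3021576, 1171, 1.0, 0.0625),
    ("better",        1.3392979, 1026, 1.0, 0.0625),
    ("using",         1.460837,  3765, 3.0, 0.0625),
    ("summarization", 2.0057983,   95, 6.0, 0.0625),
    ("summarizer",    3.8858624,   11, 1.0, 0.0625),
]


def document_score(terms, query_terms=QUERY_TERMS):
    """Sum the term scores, then scale by the fraction of query terms matched."""
    raw_sum = sum(term_score(b, df, f, n) for _, b, df, f, n in terms)
    coord = len(terms) / query_terms
    return raw_sum * coord


# (raw sum of term scores, number of matched terms) as listed for results 1 to 5.
LISTED = [
    (0.7737509, 8),
    (1.5407338, 4),
    (1.0509698, 5),
    (0.4448077, 10),
    (0.8646258, 5),
]
final_scores = [raw * matched / QUERY_TERMS for raw, matched in LISTED]

if __name__ == "__main__":
    print(document_score(TERMS_2693))
    print(final_scores)
```

Under these assumptions the recomputed score for doc 2693 agrees with the listed 0.24760029 up to floating-point rounding. The second list shows the effect of coord on the ranking: result 2 has the largest raw sum (1.5407338, driven by the title match on "anaphora") but matches only 4 of the 25 query terms, so it ends up just below result 1, which matches 8.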