Document (#32933)

Author
Sparck Jones, K.
Title
Automatic summarising : the state of the art
Source
Information processing and management. 43(2007) no.6, pp. 1449-1481
Year
2007
Abstract
This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.
Theme
Automatic abstracting

Similar documents (author)

  1. Sparck Jones, K.: Fashionable trends and feasible strategies in information management (1988) 5.31
    5.3113213 = sum of:
      5.3113213 = sum of:
        2.0544336 = weight(author_txt:jones in 817) [ClassicSimilarity], result of:
          2.0544336 = score(doc=817,freq=1.0), product of:
            0.5925071 = queryWeight, product of:
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.085440755 = queryNorm
            3.4673567 = fieldWeight in 817, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.5 = fieldNorm(doc=817)
        3.2568877 = weight(author_txt:sparck in 817) [ClassicSimilarity], result of:
          3.2568877 = score(doc=817,freq=1.0), product of:
            0.80556524 = queryWeight, product of:
              1.1660135 = boost
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.085440755 = queryNorm
            4.0429845 = fieldWeight in 817, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.5 = fieldNorm(doc=817)
    
  2. Sparck Jones, K.: Automatic classification (1976) 5.31
    5.3113213 = sum of:
      5.3113213 = sum of:
        2.0544336 = weight(author_txt:jones in 2908) [ClassicSimilarity], result of:
          2.0544336 = score(doc=2908,freq=1.0), product of:
            0.5925071 = queryWeight, product of:
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.085440755 = queryNorm
            3.4673567 = fieldWeight in 2908, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.5 = fieldNorm(doc=2908)
        3.2568877 = weight(author_txt:sparck in 2908) [ClassicSimilarity], result of:
          3.2568877 = score(doc=2908,freq=1.0), product of:
            0.80556524 = queryWeight, product of:
              1.1660135 = boost
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.085440755 = queryNorm
            4.0429845 = fieldWeight in 2908, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.5 = fieldNorm(doc=2908)
    
  3. Sparck Jones, K.: The role of artificial intelligence in information retrieval (1991) 5.31
    5.3113213 = sum of:
      5.3113213 = sum of:
        2.0544336 = weight(author_txt:jones in 4811) [ClassicSimilarity], result of:
          2.0544336 = score(doc=4811,freq=1.0), product of:
            0.5925071 = queryWeight, product of:
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.085440755 = queryNorm
            3.4673567 = fieldWeight in 4811, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.5 = fieldNorm(doc=4811)
        3.2568877 = weight(author_txt:sparck in 4811) [ClassicSimilarity], result of:
          3.2568877 = score(doc=4811,freq=1.0), product of:
            0.80556524 = queryWeight, product of:
              1.1660135 = boost
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.085440755 = queryNorm
            4.0429845 = fieldWeight in 4811, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.5 = fieldNorm(doc=4811)
    
  4. Sparck Jones, K.: Automatic keyword classification for information retrieval (1971) 5.31
    5.3113213 = sum of:
      5.3113213 = sum of:
        2.0544336 = weight(author_txt:jones in 5176) [ClassicSimilarity], result of:
          2.0544336 = score(doc=5176,freq=1.0), product of:
            0.5925071 = queryWeight, product of:
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.085440755 = queryNorm
            3.4673567 = fieldWeight in 5176, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.5 = fieldNorm(doc=5176)
        3.2568877 = weight(author_txt:sparck in 5176) [ClassicSimilarity], result of:
          3.2568877 = score(doc=5176,freq=1.0), product of:
            0.80556524 = queryWeight, product of:
              1.1660135 = boost
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.085440755 = queryNorm
            4.0429845 = fieldWeight in 5176, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.5 = fieldNorm(doc=5176)
    
  5. Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval (1972) 5.31
    5.3113213 = sum of:
      5.3113213 = sum of:
        2.0544336 = weight(author_txt:jones in 5187) [ClassicSimilarity], result of:
          2.0544336 = score(doc=5187,freq=1.0), product of:
            0.5925071 = queryWeight, product of:
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.085440755 = queryNorm
            3.4673567 = fieldWeight in 5187, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9347134 = idf(docFreq=116, maxDocs=44218)
              0.5 = fieldNorm(doc=5187)
        3.2568877 = weight(author_txt:sparck in 5187) [ClassicSimilarity], result of:
          3.2568877 = score(doc=5187,freq=1.0), product of:
            0.80556524 = queryWeight, product of:
              1.1660135 = boost
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.085440755 = queryNorm
            4.0429845 = fieldWeight in 5187, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.085969 = idf(docFreq=36, maxDocs=44218)
              0.5 = fieldNorm(doc=5187)
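
Each tree above follows the Lucene ClassicSimilarity (TF-IDF) explain layout: a term score is queryWeight times fieldWeight, and the document score is the sum over matching terms. The following is a minimal Python sketch that recomputes the 5.3113213 author score from the values printed in the trees. The function names are illustrative and not part of Lucene, and the idf form 1 + ln(maxDocs / (docFreq + 1)) is an assumption, chosen because it agrees with the printed idf values.

```python
import math

QUERY_NORM = 0.085440755   # queryNorm printed in the author trees
MAX_DOCS = 44218           # maxDocs printed in the author trees


def idf(doc_freq, max_docs):
    # Assumed idf form; it agrees with the idf values printed in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, idf_value, query_norm, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf_value * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)
    field_weight = math.sqrt(freq) * idf_value * field_norm
    return query_weight * field_weight


# author_txt:jones  (docFreq=116, no boost printed, fieldNorm=0.5)
jones = term_score(1.0, idf(116, MAX_DOCS), QUERY_NORM, 0.5)
# author_txt:sparck (docFreq=36, boost=1.1660135, fieldNorm=0.5)
sparck = term_score(1.0, idf(36, MAX_DOCS), QUERY_NORM, 0.5, boost=1.1660135)

author_score = jones + sparck
print(jones, sparck, author_score)
```

The printed values should agree with 2.0544336, 3.2568877 and 5.3113213 up to single-precision rounding. All five author matches share the same score because each has termFreq=1.0 and fieldNorm=0.5 for both name tokens.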
    

Similar documents (content)

  1. Endres-Niggemeyer, B.: Summarising text for intelligent communication : results of the Dagstuhl seminar (1994) 0.27
    0.2691209 = sum of:
      0.2691209 = product of:
        1.6820056 = sum of:
          0.00996907 = weight(abstract_txt:research in 8867) [ClassicSimilarity], result of:
            0.00996907 = score(doc=8867,freq=2.0), product of:
              0.028460598 = queryWeight, product of:
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.00897715 = queryNorm
              0.35027617 = fieldWeight in 8867, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.078125 = fieldNorm(doc=8867)
          0.086467326 = weight(abstract_txt:summarisation in 8867) [ClassicSimilarity], result of:
            0.086467326 = score(doc=8867,freq=1.0), product of:
              0.120145895 = queryWeight, product of:
                1.4528389 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.00897715 = queryNorm
              0.71968603 = fieldWeight in 8867, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=8867)
          0.013179375 = weight(abstract_txt:systems in 8867) [ClassicSimilarity], result of:
            0.013179375 = score(doc=8867,freq=1.0), product of:
              0.049443733 = queryWeight, product of:
                1.6142814 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.00897715 = queryNorm
              0.26655298 = fieldWeight in 8867, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.078125 = fieldNorm(doc=8867)
          1.5723898 = weight(abstract_txt:summarising in 8867) [ClassicSimilarity], result of:
            1.5723898 = score(doc=8867,freq=6.0), product of:
              0.8746414 = queryWeight, product of:
                10.371153 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.00897715 = queryNorm
              1.7977537 = fieldWeight in 8867, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.078125 = fieldNorm(doc=8867)
        0.16 = coord(4/25)
    
  2. Macgregor, G.; McCulloch, E.: Collaborative tagging as a knowledge organisation and resource discovery tool (2006) 0.13
    0.12731123 = sum of:
      0.12731123 = product of:
        0.53046346 = sum of:
          0.008546695 = weight(abstract_txt:research in 764) [ClassicSimilarity], result of:
            0.008546695 = score(doc=764,freq=3.0), product of:
              0.028460598 = queryWeight, product of:
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.00897715 = queryNorm
              0.30029923 = fieldWeight in 764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
          0.02517754 = weight(abstract_txt:engage in 764) [ClassicSimilarity], result of:
            0.02517754 = score(doc=764,freq=1.0), product of:
              0.06695008 = queryWeight, product of:
                1.0845225 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.00897715 = queryNorm
              0.37606436 = fieldWeight in 764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
          0.0111810975 = weight(abstract_txt:paper in 764) [ClassicSimilarity], result of:
            0.0111810975 = score(doc=764,freq=3.0), product of:
              0.034043547 = queryWeight, product of:
                1.0936929 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.00897715 = queryNorm
              0.32843515 = fieldWeight in 764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
          0.017759172 = weight(abstract_txt:review in 764) [ClassicSimilarity], result of:
            0.017759172 = score(doc=764,freq=1.0), product of:
              0.066839635 = queryWeight, product of:
                1.5324807 = boost
                4.858482 = idf(docFreq=932, maxDocs=44218)
                0.00897715 = queryNorm
              0.26569822 = fieldWeight in 764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.858482 = idf(docFreq=932, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
          0.018451124 = weight(abstract_txt:systems in 764) [ClassicSimilarity], result of:
            0.018451124 = score(doc=764,freq=4.0), product of:
              0.049443733 = queryWeight, product of:
                1.6142814 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.00897715 = queryNorm
              0.3731742 = fieldWeight in 764, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
          0.44934782 = weight(abstract_txt:summarising in 764) [ClassicSimilarity], result of:
            0.44934782 = score(doc=764,freq=1.0), product of:
              0.8746414 = queryWeight, product of:
                10.371153 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.00897715 = queryNorm
              0.5137509 = fieldWeight in 764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0546875 = fieldNorm(doc=764)
        0.24 = coord(6/25)
    
  3. Liang, S.-F.; Devlin, S.; Tait, J.: Investigating sentence weighting components for automatic summarisation (2007) 0.09
    0.086136766 = sum of:
      0.086136766 = product of:
        0.7178064 = sum of:
          0.009222014 = weight(abstract_txt:paper in 899) [ClassicSimilarity], result of:
            0.009222014 = score(doc=899,freq=1.0), product of:
              0.034043547 = queryWeight, product of:
                1.0936929 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.00897715 = queryNorm
              0.27088875 = fieldWeight in 899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.078125 = fieldNorm(doc=899)
          0.066658944 = weight(abstract_txt:summaries in 899) [ClassicSimilarity], result of:
            0.066658944 = score(doc=899,freq=3.0), product of:
              0.07003893 = queryWeight, product of:
                1.1092584 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.00897715 = queryNorm
              0.95174134 = fieldWeight in 899, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.078125 = fieldNorm(doc=899)
          0.64192545 = weight(abstract_txt:summarising in 899) [ClassicSimilarity], result of:
            0.64192545 = score(doc=899,freq=1.0), product of:
              0.8746414 = queryWeight, product of:
                10.371153 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.00897715 = queryNorm
              0.7339299 = fieldWeight in 899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.078125 = fieldNorm(doc=899)
        0.12 = coord(3/25)
    
  4. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.08
    0.08154213 = sum of:
      0.08154213 = product of:
        0.33975887 = sum of:
          0.007377611 = weight(abstract_txt:paper in 948) [ClassicSimilarity], result of:
            0.007377611 = score(doc=948,freq=1.0), product of:
              0.034043547 = queryWeight, product of:
                1.0936929 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.00897715 = queryNorm
              0.216711 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.053327154 = weight(abstract_txt:summaries in 948) [ClassicSimilarity], result of:
            0.053327154 = score(doc=948,freq=3.0), product of:
              0.07003893 = queryWeight, product of:
                1.1092584 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.00897715 = queryNorm
              0.7613931 = fieldWeight in 948, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.014910761 = weight(abstract_txt:systems in 948) [ClassicSimilarity], result of:
            0.014910761 = score(doc=948,freq=2.0), product of:
              0.049443733 = queryWeight, product of:
                1.6142814 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.00897715 = queryNorm
              0.3015703 = fieldWeight in 948, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.052653313 = weight(abstract_txt:automatic in 948) [ClassicSimilarity], result of:
            0.052653313 = score(doc=948,freq=2.0), product of:
              0.11465558 = queryWeight, product of:
                2.4582226 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.00897715 = queryNorm
              0.45923027 = fieldWeight in 948, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.14230514 = weight(abstract_txt:extractive in 948) [ClassicSimilarity], result of:
            0.14230514 = score(doc=948,freq=1.0), product of:
              0.24485257 = queryWeight, product of:
                2.9331234 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.00897715 = queryNorm
              0.581187 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.0691849 = weight(abstract_txt:evaluation in 948) [ClassicSimilarity], result of:
            0.0691849 = score(doc=948,freq=3.0), product of:
              0.14246388 = queryWeight, product of:
                3.537532 = boost
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.00897715 = queryNorm
              0.48563117 = fieldWeight in 948, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
        0.24 = coord(6/25)
    
  5. Hirao, T.; Okumura, M.; Yasuda, N.; Isozaki, H.: Supervised automatic evaluation for summarization with voted regression model (2007) 0.07
    0.06690239 = sum of:
      0.06690239 = product of:
        0.33451194 = sum of:
          0.0544268 = weight(abstract_txt:summaries in 942) [ClassicSimilarity], result of:
            0.0544268 = score(doc=942,freq=2.0), product of:
              0.07003893 = queryWeight, product of:
                1.1092584 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.00897715 = queryNorm
              0.7770935 = fieldWeight in 942, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.078125 = fieldNorm(doc=942)
          0.011684914 = weight(abstract_txt:they in 942) [ClassicSimilarity], result of:
            0.011684914 = score(doc=942,freq=1.0), product of:
              0.039862815 = queryWeight, product of:
                1.1834829 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.00897715 = queryNorm
              0.29312816 = fieldWeight in 942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.078125 = fieldNorm(doc=942)
          0.013179375 = weight(abstract_txt:systems in 942) [ClassicSimilarity], result of:
            0.013179375 = score(doc=942,freq=1.0), product of:
              0.049443733 = queryWeight, product of:
                1.6142814 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.00897715 = queryNorm
              0.26655298 = fieldWeight in 942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.078125 = fieldNorm(doc=942)
          0.11399777 = weight(abstract_txt:automatic in 942) [ClassicSimilarity], result of:
            0.11399777 = score(doc=942,freq=6.0), product of:
              0.11465558 = queryWeight, product of:
                2.4582226 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.00897715 = queryNorm
              0.99426275 = fieldWeight in 942, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=942)
          0.14122309 = weight(abstract_txt:evaluation in 942) [ClassicSimilarity], result of:
            0.14122309 = score(doc=942,freq=8.0), product of:
              0.14246388 = queryWeight, product of:
                3.537532 = boost
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.00897715 = queryNorm
              0.9912905 = fieldWeight in 942, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.078125 = fieldNorm(doc=942)
        0.2 = coord(5/25)
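
The content trees add one step to the per-term product: the summed term scores are multiplied by coord(matched/total), the fraction of query terms found in the document. The following Python sketch recomputes the 0.2691209 score of the first content match (doc=8867) from the values printed in its tree. The names are illustrative only, and a boost of 1.0 is assumed wherever the tree prints no boost line.

```python
import math

QUERY_NORM = 0.00897715   # queryNorm printed in the content trees
FIELD_NORM = 0.078125     # fieldNorm(doc=8867)
QUERY_TERMS = 25          # denominator of coord(4/25)

# (term, termFreq, idf, boost) copied from the tree for doc 8867.
# boost is assumed to be 1.0 where the tree shows none.
matched = [
    ("research", 2.0, 3.170338, 1.0),
    ("summarisation", 1.0, 9.211981, 1.4528389),
    ("systems", 1.0, 3.4118783, 1.6142814),
    ("summarising", 6.0, 9.394302, 10.371153),
]


def term_score(freq, idf, boost):
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    query_weight = boost * idf * QUERY_NORM    # queryWeight
    field_weight = tf * idf * FIELD_NORM       # fieldWeight
    return query_weight * field_weight


raw_sum = sum(term_score(f, i, b) for _, f, i, b in matched)
coord = len(matched) / QUERY_TERMS             # 4 of 25 query terms matched
total = raw_sum * coord
print(raw_sum, coord, total)
```

This also shows why the ranking does not track coord alone: the first match has the lowest coord shown except for 3/25, yet it ranks first because the heavily boosted term "summarising" occurs six times in it.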