Document (#35749)

Marcu, D.
Automatic abstracting and summarization
Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
London : Taylor & Francis
After lying dormant for a few decades, the field of automated text summarization has experienced a tremendous resurgence of interest. Recently, many new algorithms and techniques have been proposed for identifying important information in single documents and document collections, and for mapping this information into grammatical, cohesive, and coherent abstracts. Since 1997, annual workshops, conferences, and large-scale comparative evaluations have provided a rich environment for exchanging ideas between researchers in Asia, Europe, and North America. This entry reviews the main developments in the field and provides a guiding map to those interested in understanding the strengths and weaknesses of an increasingly ubiquitous technology.
Automatic abstracting (Automatisches Abstracting)

Similar documents (content)

  1. Pinfield, S.; Salter, J.; Bath, P.A.; Hubbard, B.; Millington, P.; Anders, J.H.S.; Hussain, A.: Open-access repositories worldwide, 2005-2012 : past growth, current characteristics, and future possibilities (2014) 0.11
    0.10953 = sum of:
      0.10953 = product of:
        0.456375 = sum of:
          0.025249494 = weight(abstract_txt:have in 1542) [ClassicSimilarity], result of:
            0.025249494 = score(doc=1542,freq=3.0), product of:
              0.07278435 = queryWeight, product of:
                1.0008777 = boost
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02269253 = queryNorm
              0.3469083 = fieldWeight in 1542, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
          0.08551989 = weight(abstract_txt:europe in 1542) [ClassicSimilarity], result of:
            0.08551989 = score(doc=1542,freq=2.0), product of:
              0.14914249 = queryWeight, product of:
                1.0130893 = boost
                6.487401 = idf(docFreq=182, maxDocs=44218)
                0.02269253 = queryNorm
              0.57341063 = fieldWeight in 1542, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.487401 = idf(docFreq=182, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
          0.06289492 = weight(abstract_txt:north in 1542) [ClassicSimilarity], result of:
            0.06289492 = score(doc=1542,freq=1.0), product of:
              0.15310064 = queryWeight, product of:
                1.0264447 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.02269253 = queryNorm
              0.4108077 = fieldWeight in 1542, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
          0.06412458 = weight(abstract_txt:experienced in 1542) [ClassicSimilarity], result of:
            0.06412458 = score(doc=1542,freq=1.0), product of:
              0.15508969 = queryWeight, product of:
                1.0330908 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.02269253 = queryNorm
              0.41346768 = fieldWeight in 1542, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
          0.10484607 = weight(abstract_txt:america in 1542) [ClassicSimilarity], result of:
            0.10484607 = score(doc=1542,freq=2.0), product of:
              0.17084073 = queryWeight, product of:
                1.0842832 = boost
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.02269253 = queryNorm
              0.6137065 = fieldWeight in 1542, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
          0.11374005 = weight(abstract_txt:asia in 1542) [ClassicSimilarity], result of:
            0.11374005 = score(doc=1542,freq=1.0), product of:
              0.22725262 = queryWeight, product of:
                1.2505512 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.02269253 = queryNorm
              0.5005005 = fieldWeight in 1542, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0625 = fieldNorm(doc=1542)
        0.24 = coord(6/25)
  2. Yang, C.C.; Wang, F.L.: Hierarchical summarization of large documents (2008) 0.09
    0.092099644 = sum of:
      0.092099644 = product of:
        0.5756228 = sum of:
          0.025249494 = weight(abstract_txt:have in 1719) [ClassicSimilarity], result of:
            0.025249494 = score(doc=1719,freq=3.0), product of:
              0.07278435 = queryWeight, product of:
                1.0008777 = boost
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02269253 = queryNorm
              0.3469083 = fieldWeight in 1719, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.0625 = fieldNorm(doc=1719)
          0.058712926 = weight(abstract_txt:decades in 1719) [ClassicSimilarity], result of:
            0.058712926 = score(doc=1719,freq=1.0), product of:
              0.1462365 = queryWeight, product of:
                1.0031708 = boost
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.02269253 = queryNorm
              0.40149298 = fieldWeight in 1719, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.0625 = fieldNorm(doc=1719)
          0.06640348 = weight(abstract_txt:evaluations in 1719) [ClassicSimilarity], result of:
            0.06640348 = score(doc=1719,freq=1.0), product of:
              0.15874273 = queryWeight, product of:
                1.0451869 = boost
                6.6929407 = idf(docFreq=148, maxDocs=44218)
                0.02269253 = queryNorm
              0.4183088 = fieldWeight in 1719, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6929407 = idf(docFreq=148, maxDocs=44218)
                0.0625 = fieldNorm(doc=1719)
          0.42525688 = weight(abstract_txt:summarization in 1719) [ClassicSimilarity], result of:
            0.42525688 = score(doc=1719,freq=7.0), product of:
              0.3605605 = queryWeight, product of:
                2.2276714 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.02269253 = queryNorm
              1.1794327 = fieldWeight in 1719, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=1719)
        0.16 = coord(4/25)
  3. Rader, H.B.: Information literacy 1973-2002 : a selected literature review (2002) 0.09
    0.08859793 = sum of:
      0.08859793 = product of:
        0.36915806 = sum of:
          0.02852234 = weight(abstract_txt:have in 43) [ClassicSimilarity], result of:
            0.02852234 = score(doc=43,freq=5.0), product of:
              0.07278435 = queryWeight, product of:
                1.0008777 = boost
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02269253 = queryNorm
              0.39187464 = fieldWeight in 43, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
          0.07265354 = weight(abstract_txt:decades in 43) [ClassicSimilarity], result of:
            0.07265354 = score(doc=43,freq=2.0), product of:
              0.1462365 = queryWeight, product of:
                1.0031708 = boost
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.02269253 = queryNorm
              0.49682224 = fieldWeight in 43, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
          0.061112043 = weight(abstract_txt:annual in 43) [ClassicSimilarity], result of:
            0.061112043 = score(doc=43,freq=1.0), product of:
              0.16417705 = queryWeight, product of:
                1.0629265 = boost
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.02269253 = queryNorm
              0.37223256 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
          0.06487019 = weight(abstract_txt:america in 43) [ClassicSimilarity], result of:
            0.06487019 = score(doc=43,freq=1.0), product of:
              0.17084073 = queryWeight, product of:
                1.0842832 = boost
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.02269253 = queryNorm
              0.37971154 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
          0.10686852 = weight(abstract_txt:tremendous in 43) [ClassicSimilarity], result of:
            0.10686852 = score(doc=43,freq=1.0), product of:
              0.23830205 = queryWeight, product of:
                1.2805924 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.02269253 = queryNorm
              0.44845825 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
          0.035131417 = weight(abstract_txt:field in 43) [ClassicSimilarity], result of:
            0.035131417 = score(doc=43,freq=1.0), product of:
              0.14301065 = queryWeight, product of:
                1.402963 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.02269253 = queryNorm
              0.24565597 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.0546875 = fieldNorm(doc=43)
        0.24 = coord(6/25)
  4. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.08
    0.08128943 = sum of:
      0.08128943 = product of:
        0.40644717 = sum of:
          0.018039111 = weight(abstract_txt:have in 6068) [ClassicSimilarity], result of:
            0.018039111 = score(doc=6068,freq=2.0), product of:
              0.07278435 = queryWeight, product of:
                1.0008777 = boost
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02269253 = queryNorm
              0.24784327 = fieldWeight in 6068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.07032025 = weight(abstract_txt:coherent in 6068) [ClassicSimilarity], result of:
            0.07032025 = score(doc=6068,freq=1.0), product of:
              0.18028025 = queryWeight, product of:
                1.1138357 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.02269253 = queryNorm
              0.39006072 = fieldWeight in 6068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.084060706 = weight(abstract_txt:ubiquitous in 6068) [ClassicSimilarity], result of:
            0.084060706 = score(doc=6068,freq=1.0), product of:
              0.20305948 = queryWeight, product of:
                1.1821121 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.02269253 = queryNorm
              0.41397086 = fieldWeight in 6068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.035131417 = weight(abstract_txt:field in 6068) [ClassicSimilarity], result of:
            0.035131417 = score(doc=6068,freq=1.0), product of:
              0.14301065 = queryWeight, product of:
                1.402963 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.02269253 = queryNorm
              0.24565597 = fieldWeight in 6068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.1988957 = weight(abstract_txt:summarization in 6068) [ClassicSimilarity], result of:
            0.1988957 = score(doc=6068,freq=2.0), product of:
              0.3605605 = queryWeight, product of:
                2.2276714 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.02269253 = queryNorm
              0.5516292 = fieldWeight in 6068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
        0.2 = coord(5/25)
  5. Hjoerland, B.; Hartel, J.: Introduction to a Special Issue of Knowledge Organization (2003) 0.07
    0.07172282 = sum of:
      0.07172282 = product of:
        0.22413382 = sum of:
          0.012755578 = weight(abstract_txt:have in 3013) [ClassicSimilarity], result of:
            0.012755578 = score(doc=3013,freq=4.0), product of:
              0.07278435 = queryWeight, product of:
                1.0008777 = boost
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02269253 = queryNorm
              0.17525166 = fieldWeight in 3013, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.2046018 = idf(docFreq=4876, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.026456367 = weight(abstract_txt:europe in 3013) [ClassicSimilarity], result of:
            0.026456367 = score(doc=3013,freq=1.0), product of:
              0.14914249 = queryWeight, product of:
                1.0130893 = boost
                6.487401 = idf(docFreq=182, maxDocs=44218)
                0.02269253 = queryNorm
              0.17738988 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.487401 = idf(docFreq=182, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.027516529 = weight(abstract_txt:north in 3013) [ClassicSimilarity], result of:
            0.027516529 = score(doc=3013,freq=1.0), product of:
              0.15310064 = queryWeight, product of:
                1.0264447 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.02269253 = queryNorm
              0.17972837 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.029317193 = weight(abstract_txt:strengths in 3013) [ClassicSimilarity], result of:
            0.029317193 = score(doc=3013,freq=1.0), product of:
              0.15970904 = queryWeight, product of:
                1.0483632 = boost
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.02269253 = queryNorm
              0.18356627 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.029965613 = weight(abstract_txt:weaknesses in 3013) [ClassicSimilarity], result of:
            0.029965613 = score(doc=3013,freq=1.0), product of:
              0.16205534 = queryWeight, product of:
                1.056036 = boost
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.02269253 = queryNorm
              0.18490975 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.030556021 = weight(abstract_txt:annual in 3013) [ClassicSimilarity], result of:
            0.030556021 = score(doc=3013,freq=1.0), product of:
              0.16417705 = queryWeight, product of:
                1.0629265 = boost
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.02269253 = queryNorm
              0.18611628 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.032435097 = weight(abstract_txt:america in 3013) [ClassicSimilarity], result of:
            0.032435097 = score(doc=3013,freq=1.0), product of:
              0.17084073 = queryWeight, product of:
                1.0842832 = boost
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.02269253 = queryNorm
              0.18985577 = fieldWeight in 3013, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
          0.035131417 = weight(abstract_txt:field in 3013) [ClassicSimilarity], result of:
            0.035131417 = score(doc=3013,freq=4.0), product of:
              0.14301065 = queryWeight, product of:
                1.402963 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.02269253 = queryNorm
              0.24565597 = fieldWeight in 3013, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3013)
        0.32 = coord(8/25)
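
The score breakdowns above are in the explain format of Lucene's ClassicSimilarity (TF-IDF). As a minimal sketch, assuming the default ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the figures for entry 1 (doc 1542) can be recomputed as follows. The function and variable names are hypothetical; the boost, queryNorm, and fieldNorm values are copied from the listing rather than derived here.

```python
import math

MAX_DOCS = 44218          # maxDocs, as reported in the listing
QUERY_NORM = 0.02269253   # queryNorm, as reported in the listing


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float,
               field_norm: float, query_norm: float = QUERY_NORM) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    i = idf(doc_freq)
    query_weight = boost * i * query_norm     # boost * idf * queryNorm
    field_weight = tf(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for doc 1542, copied from entry 1.
terms_1542 = [
    ("have",        3.0, 4876, 1.0008777),
    ("europe",      2.0,  182, 1.0130893),
    ("north",       1.0,  167, 1.0264447),
    ("experienced", 1.0,  160, 1.0330908),
    ("america",     2.0,  115, 1.0842832),
    ("asia",        1.0,   39, 1.2505512),
]

FIELD_NORM_1542 = 0.0625  # fieldNorm(doc=1542)
COORD = 6 / 25            # 6 of the 25 query terms matched

term_sum = sum(term_score(freq, df, boost, FIELD_NORM_1542)
               for _, freq, df, boost in terms_1542)
total = term_sum * COORD  # document score = coord * sum of term scores

print(f"sum of term scores: {term_sum:.6f}")
print(f"document score:     {total:.6f}")
```

The listing reports 0.456375 for the sum and 0.10953 for the final score of entry 1. Small differences in the last digits are expected, since Lucene computes in single precision while this sketch uses Python's double-precision floats.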