Search (114 results, page 1 of 6)

Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.06

0.056765005 = product of:
  0.11353001 = sum of:
    0.11353001 = sum of:
      0.013081744 = weight(_text_:a in 6751) [ClassicSimilarity], result of:
        0.013081744 = score(doc=6751,freq=12.0), product of:
          0.05240202 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.04544656 = queryNorm
          0.24964198 = fieldWeight in 6751, product of:
            3.4641016 = tf(freq=12.0), with freq of:
              12.0 = termFreq=12.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.0625 = fieldNorm(doc=6751)
      0.051189214 = weight(_text_:k in 6751) [ClassicSimilarity], result of:
        0.051189214 = score(doc=6751,freq=2.0), product of:
          0.16223413 = queryWeight, product of:
            3.569778 = idf(docFreq=3384, maxDocs=44218)
            0.04544656 = queryNorm
          0.31552678 = fieldWeight in 6751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.569778 = idf(docFreq=3384, maxDocs=44218)
            0.0625 = fieldNorm(doc=6751)
      0.049259055 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
        0.049259055 = score(doc=6751,freq=2.0), product of:
          0.15914612 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04544656 = queryNorm
          0.30952093 = fieldWeight in 6751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=6751)
  0.5 = coord(1/2)

Abstract: Presents a system for summarizing quantitative data in natural language, focusing on the use of a corpus of basketball game summaries, drawn from online news services, to empirically shape the system design and to evaluate the approach. Initial corpus analysis revealed characteristics of textual summaries that challenge the capabilities of current language generation systems. A revision-based corpus analysis was used to identify and encode the revision rules of the system. Presents a quantitative evaluation, using several test corpora, to measure the robustness of the new revision-based model.
Date: 6. 3.1997 16:22:15
Type: a
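
The score tree for doc 6751 above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm), which agree with the figures printed in the tree. The function and variable names are illustrative only, and Lucene itself computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as assumed for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # weight = queryWeight * fieldWeight for one query term in one document.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 6751.
MAX_DOCS = 44218
QUERY_NORM = 0.04544656
FIELD_NORM = 0.0625

# term -> (freq in doc 6751, docFreq in the index)
terms = {"a": (12.0, 37942), "k": (2.0, 3384), "22": (2.0, 3622)}

parts = {
    term: term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, doc_freq) in terms.items()
}

# The outer factor 0.5 is the coord(1/2) line at the bottom of the tree.
total = 0.5 * sum(parts.values())

for term, value in parts.items():
    print(f"{term}: {value:.9f}")
print(f"total: {total:.9f}")
```

The per-term values should land close to the 0.013081744, 0.051189214 and 0.049259055 shown in the tree, and the total close to the listed 0.056765005.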

Automatic summarizing : introduction (1995) 0.04

0.03664373 = product of:
  0.07328746 = sum of:
    0.07328746 = product of:
      0.10993118 = sum of:
        0.0075527485 = weight(_text_:a in 626) [ClassicSimilarity], result of:
          0.0075527485 = score(doc=626,freq=4.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.14413087 = fieldWeight in 626, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=626)
        0.10237843 = weight(_text_:k in 626) [ClassicSimilarity], result of:
          0.10237843 = score(doc=626,freq=8.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.63105357 = fieldWeight in 626, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=626)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Content: Includes contributions by, among others: J. BATEMAN and E. TEICH; R. BRANDOW, K. MITZE and L.F. RAU; B. ENDRES-NIGGEMEYER, E. MAIER and A. SIGEL; M.T. MAYBURY; K. McKEOWN, J. ROBIN and K. KUKICH; A. ROTHKEGEL
Editor: Sparck Jones, K. and B. Endres-Niggemeyer
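
The tree for doc 626 above differs from the first result in that only two of three terms matched, so a second coordination factor appears. A minimal sketch of how the nested coord factors combine, assuming coord is the fraction of query clauses that matched (as in Lucene's ClassicSimilarity) and assuming the three inner terms are "a", "k" and "22", inferred from the other trees on this page; the names below are illustrative:

```python
def coord(matched, total):
    # Fraction of the clauses at one level of the query that matched.
    return matched / total

# Per-term weights copied from the explain tree for doc 626.
term_weights = {"a": 0.0075527485, "k": 0.10237843}

# Inner level: 2 of 3 terms matched (there is no "22" entry in this tree).
inner = sum(term_weights.values()) * coord(2, 3)

# Outer level: 1 of 2 top-level clauses matched.
score = inner * coord(1, 2)

print(f"inner: {inner:.8f}")
print(f"score: {score:.8f}")
```

This reproduces the listed 0.03664373 to within float rounding, and shows why a document matching fewer query terms is scaled down even when one term weight is high.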

McKeown, K.; Robin, J.; Kukich, K.: Generating concise natural language summaries (1995) 0.03

0.032388784 = product of:
  0.06477757 = sum of:
    0.06477757 = product of:
      0.09716635 = sum of:
        0.006675749 = weight(_text_:a in 2932) [ClassicSimilarity], result of:
          0.006675749 = score(doc=2932,freq=2.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.12739488 = fieldWeight in 2932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=2932)
        0.0904906 = weight(_text_:k in 2932) [ClassicSimilarity], result of:
          0.0904906 = score(doc=2932,freq=4.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.5577778 = fieldWeight in 2932, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.078125 = fieldNorm(doc=2932)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Type: a

Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.02

0.024475817 = product of:
  0.048951633 = sum of:
    0.048951633 = product of:
      0.073427446 = sum of:
        0.009440936 = weight(_text_:a in 1949) [ClassicSimilarity], result of:
          0.009440936 = score(doc=1949,freq=4.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.18016359 = fieldWeight in 1949, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
        0.06398651 = weight(_text_:k in 1949) [ClassicSimilarity], result of:
          0.06398651 = score(doc=1949,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.39440846 = fieldWeight in 1949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 478-483.
Type: a

Marsh, E.: A production rule system for message summarisation (1984) 0.02

0.024475817 = product of:
  0.048951633 = sum of:
    0.048951633 = product of:
      0.073427446 = sum of:
        0.009440936 = weight(_text_:a in 1956) [ClassicSimilarity], result of:
          0.009440936 = score(doc=1956,freq=4.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.18016359 = fieldWeight in 1956, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=1956)
        0.06398651 = weight(_text_:k in 1956) [ClassicSimilarity], result of:
          0.06398651 = score(doc=1956,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.39440846 = fieldWeight in 1956, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.078125 = fieldNorm(doc=1956)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 534-537.
Type: a

Johnson, F.C.; Paice, C.D.; Black, W.J.; Neal, A.P.: The application of linguistic processing to automatic abstract generation (1993) 0.02

0.023554087 = product of:
  0.047108173 = sum of:
    0.047108173 = product of:
      0.07066226 = sum of:
        0.006675749 = weight(_text_:a in 2290) [ClassicSimilarity], result of:
          0.006675749 = score(doc=2290,freq=2.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.12739488 = fieldWeight in 2290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=2290)
        0.06398651 = weight(_text_:k in 2290) [ClassicSimilarity], result of:
          0.06398651 = score(doc=2290,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.39440846 = fieldWeight in 2290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.078125 = fieldNorm(doc=2290)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 538-552.
Type: a

Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.02

0.021454852 = product of:
  0.042909704 = sum of:
    0.042909704 = product of:
      0.06436455 = sum of:
        0.015105497 = weight(_text_:a in 6599) [ClassicSimilarity], result of:
          0.015105497 = score(doc=6599,freq=16.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.28826174 = fieldWeight in 6599, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
        0.049259055 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
          0.049259055 = score(doc=6599,freq=2.0), product of:
            0.15914612 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544656 = queryNorm
            0.30952093 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: With the onset of the information explosion arising from digital libraries and access to a wealth of information through the Internet, the need to efficiently determine the relevance of a document becomes even more urgent. Describes a text extraction system (TES), which retrieves a set of sentences from a document to form an indicative abstract. Such an automated process enables information to be filtered more quickly. Discusses the combination of various text extraction techniques. Compares results with manually produced abstracts
Date: 26. 2.1997 10:22:43
Type: a

Brandow, R.; Mitze, K.; Rau, L.F.: Automatic condensation of electronic publications by sentence selection (1995) 0.02

0.020623472 = product of:
  0.041246943 = sum of:
    0.041246943 = product of:
      0.06187041 = sum of:
        0.010681199 = weight(_text_:a in 2929) [ClassicSimilarity], result of:
          0.010681199 = score(doc=2929,freq=8.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.20383182 = fieldWeight in 2929, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=2929)
        0.051189214 = weight(_text_:k in 2929) [ClassicSimilarity], result of:
          0.051189214 = score(doc=2929,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.31552678 = fieldWeight in 2929, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=2929)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: Description of a system that performs domain-independent automatic condensation of news from a large commercial news service encompassing 41 different publications. This system was evaluated against a system that condensed the same articles using only the first portions of the texts (the lead), up to the target length of the summaries. 3 lengths of articles were evaluated for 250 documents by both systems, totalling 1,500 suitability judgements in all. The lead-based summaries outperformed the 'intelligent' summaries significantly, achieving acceptability ratings of over 90%, compared to 74.7%.
Type: a

Gomez, J.; Allen, K.; Matney, M.; Awopetu, T.; Shafer, S.: Experimenting with a machine generated annotations pipeline (2020) 0.02

0.020146469 = product of:
  0.040292937 = sum of:
    0.040292937 = product of:
      0.060439404 = sum of:
        0.009250191 = weight(_text_:a in 657) [ClassicSimilarity], result of:
          0.009250191 = score(doc=657,freq=6.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.17652355 = fieldWeight in 657, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=657)
        0.051189214 = weight(_text_:k in 657) [ClassicSimilarity], result of:
          0.051189214 = score(doc=657,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.31552678 = fieldWeight in 657, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=657)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: The UCLA Library reorganized its software developers into focused subteams with one, the Labs Team, dedicated to conducting experiments. In this article we describe our first attempt at conducting a software development experiment, in which we attempted to improve our digital library's search results with metadata from cloud-based image tagging services. We explore the findings and discuss the lessons learned from our first attempt at running an experiment.
Type: a

Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.02

0.019980086 = product of:
  0.039960172 = sum of:
    0.039960172 = product of:
      0.059940256 = sum of:
        0.010681199 = weight(_text_:a in 6974) [ClassicSimilarity], result of:
          0.010681199 = score(doc=6974,freq=8.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.20383182 = fieldWeight in 6974, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=6974)
        0.049259055 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
          0.049259055 = score(doc=6974,freq=2.0), product of:
            0.15914612 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544656 = queryNorm
            0.30952093 = fieldWeight in 6974, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6974)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: Describes the application of weighting strategies to model uncertainties and probabilities in automatic abstracting systems, particularly in the concept selection phase. The weights were originally assigned in an ad hoc manner and were then refined by manual analysis of the results. The new method attempts to derive the weights more systematically and does so using a genetic algorithm.
Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
Type: a

Sparck Jones, K.; Endres-Niggemeyer, B.: Introduction: automatic summarizing (1995) 0.02

0.019580655 = product of:
  0.03916131 = sum of:
    0.03916131 = product of:
      0.058741964 = sum of:
        0.0075527485 = weight(_text_:a in 2931) [ClassicSimilarity], result of:
          0.0075527485 = score(doc=2931,freq=4.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.14413087 = fieldWeight in 2931, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=2931)
        0.051189214 = weight(_text_:k in 2931) [ClassicSimilarity], result of:
          0.051189214 = score(doc=2931,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.31552678 = fieldWeight in 2931, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=2931)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: Automatic summarizing is a research topic whose time has come. The papers illustrate some of the relevant work already under way. Places these papers in their wider context: why research and development on automatic summarizing is timely, what areas of work and ideas it should draw on, how future investigations and experiments can be effectively framed
Type: a

Ahmad, K.: Text summarisation : the role of lexical cohesion analysis (1995) 0.02

0.019580655 = product of:
  0.03916131 = sum of:
    0.03916131 = product of:
      0.058741964 = sum of:
        0.0075527485 = weight(_text_:a in 5795) [ClassicSimilarity], result of:
          0.0075527485 = score(doc=5795,freq=4.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.14413087 = fieldWeight in 5795, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=5795)
        0.051189214 = weight(_text_:k in 5795) [ClassicSimilarity], result of:
          0.051189214 = score(doc=5795,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.31552678 = fieldWeight in 5795, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=5795)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: The work in automatic text summary focuses mainly on computational models of texts. The artificial intelligence related work in text summary deals mainly with narrative texts such as newspaper reports and stories. Presents a study on the summary of non-narrative texts such as those in scientific and technical communication. Discusses syntactic cohesion; lexical cohesion; complex lexical repetition; simple and complex paraphrase; bonds and links; and Tele-pattan, an architecture for a cohesion-based text analysis and summarisation system working on SGML.
Type: a

Steinberger, J.; Poesio, M.; Kabadjov, M.A.; Jezek, K.: Two uses of anaphora resolution in summarization (2007) 0.02

0.016802754 = product of:
  0.03360551 = sum of:
    0.03360551 = product of:
      0.05040826 = sum of:
        0.0120163495 = weight(_text_:a in 949) [ClassicSimilarity], result of:
          0.0120163495 = score(doc=949,freq=18.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.22931081 = fieldWeight in 949, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=949)
        0.03839191 = weight(_text_:k in 949) [ClassicSimilarity], result of:
          0.03839191 = score(doc=949,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.23664509 = fieldWeight in 949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=949)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: We propose a new method for using anaphoric information in Latent Semantic Analysis (LSA), and discuss its application to develop an LSA-based summarizer which achieves a significantly better performance than a system not using anaphoric information, and a better performance by the ROUGE measure than all but one of the single-document summarizers participating in DUC-2002. Anaphoric information is automatically extracted using a new release of our own anaphora resolution system, GUITAR, which incorporates proper noun resolution. Our summarizer also includes a new approach for automatically identifying the dimensionality reduction of a document on the basis of the desired summarization percentage. Anaphoric information is also used to check the coherence of the summary produced by our summarizer, by a reference checker module which identifies anaphoric resolution errors caused by sentence extraction.
Type: a

Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.02

0.016329778 = product of:
  0.032659557 = sum of:
    0.032659557 = product of:
      0.048989333 = sum of:
        0.010597425 = weight(_text_:a in 945) [ClassicSimilarity], result of:
          0.010597425 = score(doc=945,freq=14.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.20223314 = fieldWeight in 945, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
        0.03839191 = weight(_text_:k in 945) [ClassicSimilarity], result of:
          0.03839191 = score(doc=945,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.23664509 = fieldWeight in 945, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. Along with that, we address the issue of data paucity by collecting data from RSS feeds available on the Internet, and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].
Type: a

Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.02
```
0.01530025 = product of:
  0.0306005 = sum of:
    0.0306005 = product of:
      0.045900747 = sum of:
        0.008956458 = weight(_text_:a in 948) [ClassicSimilarity], result of:
          0.008956458 = score(doc=948,freq=10.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.1709182 = fieldWeight in 948, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
        0.03694429 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
          0.03694429 = score(doc=948,freq=2.0), product of:
            0.15914612 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544656 = queryNorm
            0.23214069 = fieldWeight in 948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)
```
Abstract: In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
Type: a

Lam, W.; Chan, K.; Radev, D.; Saggion, H.; Teufel, S.: Context-based generic cross-lingual retrieval of documents and automated summaries (2005) 0.02

0.015109851 = product of:
  0.030219702 = sum of:
    0.030219702 = product of:
      0.045329552 = sum of:
        0.0069376426 = weight(_text_:a in 1965) [ClassicSimilarity], result of:
          0.0069376426 = score(doc=1965,freq=6.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.13239266 = fieldWeight in 1965, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=1965)
        0.03839191 = weight(_text_:k in 1965) [ClassicSimilarity], result of:
          0.03839191 = score(doc=1965,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.23664509 = fieldWeight in 1965, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=1965)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Abstract: We develop a context-based generic cross-lingual retrieval model that can deal with different language pairs. Our model considers contexts in the query translation process. Contexts in the query as well as in the documents, based on co-occurrence statistics from different granularity of passages, are exploited. We also investigate cross-lingual retrieval of automatic generic summaries. We have implemented our model for two different cross-lingual settings, namely, retrieving Chinese documents from English queries as well as retrieving English documents from Chinese queries. Extensive experiments have been conducted on a large-scale parallel corpus, enabling studies on retrieval performance for two different cross-lingual settings of full-length documents as well as automated summaries.
Type: a

Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.01

0.014132454 = product of:
  0.028264908 = sum of:
    0.028264908 = product of:
      0.04239736 = sum of:
        0.00400545 = weight(_text_:a in 932) [ClassicSimilarity], result of:
          0.00400545 = score(doc=932,freq=2.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.07643694 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
        0.03839191 = weight(_text_:k in 932) [ClassicSimilarity], result of:
          0.03839191 = score(doc=932,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.23664509 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Type: a

Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.01
```
0.013780732 = product of:
  0.027561463 = sum of:
    0.027561463 = product of:
      0.041342195 = sum of:
        0.010555287 = weight(_text_:a in 5290) [ClassicSimilarity], result of:
          0.010555287 = score(doc=5290,freq=20.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.20142901 = fieldWeight in 5290, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
        0.03078691 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
          0.03078691 = score(doc=5290,freq=2.0), product of:
            0.15914612 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544656 = queryNorm
            0.19345059 = fieldWeight in 5290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)
```
Abstract

Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: The more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding new identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than the systems we compared to and that the learning function is effective.

Date

22. 7.2006 17:25:48

Type

a

Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.01
```
0.013389781 = product of:
  0.026779562 = sum of:
    0.026779562 = product of:
      0.040169343 = sum of:
        0.00817609 = weight(_text_:a in 2693) [ClassicSimilarity], result of:
          0.00817609 = score(doc=2693,freq=12.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.15602624 = fieldWeight in 2693, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2693)
        0.031993255 = weight(_text_:k in 2693) [ClassicSimilarity], result of:
          0.031993255 = score(doc=2693,freq=2.0), product of:
            0.16223413 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.04544656 = queryNorm
            0.19720423 = fieldWeight in 2693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2693)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)
```
Abstract

Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods, to learning and graph based algorithms. The advent of human-generated knowledge bases like Wikipedia offer a further possibility in text summarization - they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization - users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that using incremental summarization can help in better understanding news articles.

Type

a

Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.01
```
0.013206033 = product of:
  0.026412066 = sum of:
    0.026412066 = product of:
      0.039618097 = sum of:
        0.008831187 = weight(_text_:a in 889) [ClassicSimilarity], result of:
          0.008831187 = score(doc=889,freq=14.0), product of:
            0.05240202 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.04544656 = queryNorm
            0.1685276 = fieldWeight in 889, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
        0.03078691 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
          0.03078691 = score(doc=889,freq=2.0), product of:
            0.15914612 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544656 = queryNorm
            0.19345059 = fieldWeight in 889, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)
```
Abstract

The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.

Date

22. 1.2023 18:57:12

Type

a
