Search (13 results, page 1 of 1)

  • theme_ss:"Automatisches Abstracting"
  1. Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 944) [ClassicSimilarity], result of:
              0.3534325 = score(doc=944,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 944, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=944)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
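The breakdown above is Lucene's ClassicSimilarity (tf-idf) explanation for the top hit. As a sanity check, the final score can be reproduced from the constants shown in the tree: tf(freq) = sqrt(freq), queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm, and two coord(1/2) factors on top. A minimal sketch, using only numbers read off the explanation for doc 944:

```python
import math

# Constants read off the score explanation for doc 944 above.
freq, idf = 6.0, 7.314861            # termFreq of "compression"; idf(docFreq=79, maxDocs=44218)
query_norm, field_norm = 0.049309507, 0.0546875

tf = math.sqrt(freq)                 # tf(freq=6.0) = 2.4494898
query_weight = tf_independent = idf * query_norm      # queryWeight = 0.36069217
field_weight = tf * idf * field_norm                  # fieldWeight = 0.97987294
raw = query_weight * field_weight    # weight(_text_:compression) = 0.3534325
score = raw * 0.5 * 0.5              # two coord(1/2) factors -> 0.08835813
```

The same arithmetic applies to every explanation tree in this result list; only freq, idf, and fieldNorm change per document.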
    
    Abstract
    This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization: a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.
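The MCR idea in the abstract above, many compressed candidates per source sentence, selected by static plus dynamic features, can be sketched minimally. The function name, the candidate format, and the particular features (a static score per candidate, word-overlap with the growing summary as the dynamic feature, a word budget) are all illustrative assumptions, not the paper's actual feature set:

```python
def select_mcr(candidates_per_sentence, budget):
    """Toy multi-candidate selection: every source sentence offers several
    compressed candidates as (text, static_score) pairs; at most one
    candidate per sentence enters the summary.  Selection is greedy on
    static_score minus a dynamic redundancy penalty (the fraction of the
    candidate's words already in the summary), under a word budget."""
    pool = [(i, text, score)
            for i, cands in enumerate(candidates_per_sentence)
            for text, score in cands]
    summary, used_words = [], set()
    while pool:
        def value(item):
            _, text, score = item
            words = set(text.lower().split())
            return score - len(words & used_words) / len(words)
        i, text, score = max(pool, key=value)
        length = sum(len(s.split()) for s in summary) + len(text.split())
        if length <= budget:
            summary.append(text)
            used_words |= set(text.lower().split())
            pool = [c for c in pool if c[0] != i]   # one candidate per sentence
        else:
            pool.remove((i, text, score))           # too long; try a shorter candidate
    return summary
```

Note how the dynamic penalty lets a lower-scoring but novel candidate beat a higher-scoring one that repeats the summary so far.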
  2. Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.09
    0.08745186 = product of:
      0.17490372 = sum of:
        0.17490372 = product of:
          0.34980744 = sum of:
            0.34980744 = weight(_text_:compression in 945) [ClassicSimilarity], result of:
              0.34980744 = score(doc=945,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.96982265 = fieldWeight in 945, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.046875 = fieldNorm(doc=945)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. In addition, we address the issue of data paucity by collecting data from RSS feeds available on the Internet and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].
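Compression as discriminative sequence labeling, as in the abstract above, means tagging each token KEEP or DROP and decoding the best tag sequence. A trained CRF is out of scope for a sketch, but the decoding step is just Viterbi over per-token emission scores plus transition scores. The stopword list and the hand-set scores below are toy assumptions standing in for learned CRF weights:

```python
STOP = {"the", "a", "that", "however", "was", "in", "of"}

def emit(token, label):
    """Toy emission score: reward KEEP for non-stopwords, DROP for stopwords
    (stands in for a learned feature-weight dot product)."""
    content = token.lower() not in STOP
    if label == "KEEP":
        return 1.0 if content else -1.0
    return 1.0 if not content else -1.0

def trans(prev, cur):
    """Small bonus for runs of the same label, so kept/dropped spans cohere."""
    return 0.2 if prev == cur else 0.0

def viterbi(tokens, labels=("KEEP", "DROP")):
    # best[i][y]: score of the best label path ending at position i with label y
    best = [{y: emit(tokens[0], y) for y in labels}]
    back = [{}]
    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for y in labels:
            score, prev = max(
                (best[i - 1][yp] + trans(yp, y) + emit(tokens[i], y), yp)
                for yp in labels)
            best[i][y] = score
            back[i][y] = prev
    y = max(labels, key=lambda l: best[-1][l])
    path = [y]
    for i in range(len(tokens) - 1, 0, -1):
        y = back[i][y]
        path.append(y)
    return path[::-1]

tokens = "the merger however was completed in march".split()
labels = viterbi(tokens)
compressed = " ".join(t for t, l in zip(tokens, labels) if l == "KEEP")
```

The output is telegraphic, which is exactly why the paper's readability-oriented recursive application over the syntax tree matters.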
  3. Finegan-Dollak, C.; Radev, D.R.: Sentence simplification, compression, and disaggregation for summarization of sophisticated documents (2016) 0.07
    0.07287655 = product of:
      0.1457531 = sum of:
        0.1457531 = product of:
          0.2915062 = sum of:
            0.2915062 = weight(_text_:compression in 3122) [ClassicSimilarity], result of:
              0.2915062 = score(doc=3122,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.8081856 = fieldWeight in 3122, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3122)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Sophisticated documents like legal cases and biomedical articles can contain unusually long sentences. Extractive summarizers can select such sentences (potentially adding hundreds of unnecessary words to the summary) or exclude them and lose important content. Sentence simplification or compression seems on the surface to be a promising solution. However, compression removes words before the selection algorithm can use them, and simplification generates sentences that may be ambiguous in an extractive summary. We therefore compare the performance of an extractive summarizer selecting from the sentences of the original document with that of the summarizer selecting from sentences shortened in three ways: simplification, compression, and disaggregation, which splits one sentence into several according to rules designed to keep all meaning. We find that on legal cases and biomedical articles, these shortening methods generate ungrammatical output. Human evaluators performed an extrinsic evaluation consisting of comprehension questions about the summaries. Evaluators given compressed, simplified, or disaggregated versions of the summaries answered fewer questions correctly than did those given summaries with unaltered sentences. Error analysis suggests two causes: altered sentences sometimes interact with the sentence selection algorithm, and alterations to sentences sometimes obscure information in the summary. We discuss future work to alleviate these problems.
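Disaggregation, as described above, splits one long sentence into several shorter ones while keeping all of the words. The paper's actual rules operate on parses; a minimal surface-level sketch, splitting only at semicolons and ", and" clause joints (an assumed, much cruder rule set), looks like this:

```python
import re

def disaggregate(sentence):
    """Toy rule-based disaggregation: split one long sentence into several
    shorter ones at semicolons and ', and' clause joints, keeping all of
    the original words (the property the paper's rules are designed for)."""
    clauses = [c.strip() for c in
               re.split(r";\s*|,\s+and\s+", sentence.rstrip("."))
               if c.strip()]
    return [c[0].upper() + c[1:] + "." for c in clauses]
```

A real implementation would split at clause boundaries in the parse tree and repair pronouns and agreement, which is where the grammaticality problems reported above arise.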
  4. Over, P.; Dang, H.; Harman, D.: DUC in context (2007) 0.06
    0.05830124 = product of:
      0.11660248 = sum of:
        0.11660248 = product of:
          0.23320496 = sum of:
            0.23320496 = weight(_text_:compression in 934) [ClassicSimilarity], result of:
              0.23320496 = score(doc=934,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.64654845 = fieldWeight in 934, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0625 = fieldNorm(doc=934)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems. Many factors can affect the design of such evaluations, requiring choices among competing alternatives. This paper examines several major themes running through three evaluations: SUMMAC, NTCIR, and DUC, with a concentration on DUC. The themes are extrinsic and intrinsic evaluation, evaluation procedures and methods, generic versus focused summaries, single- and multi-document summaries, length and compression issues, extracts versus abstracts, and issues with genre.
  5. Yeh, J.-Y.; Ke, H.-R.; Yang, W.-P.; Meng, I.-H.: Text summarization using a trainable summarizer and latent semantic analysis (2005) 0.05
    0.0515315 = product of:
      0.103063 = sum of:
        0.103063 = product of:
          0.206126 = sum of:
            0.206126 = weight(_text_:compression in 1003) [ClassicSimilarity], result of:
              0.206126 = score(doc=1003,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.5714735 = fieldWeight in 1003, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1003)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper proposes two approaches to text summarization: a modified corpus-based approach (MCBA) and an LSA-based T.R.M. approach (LSA + T.R.M.). The first is a trainable summarizer, which takes into account several features, including position, positive keyword, negative keyword, centrality, and the resemblance to the title, to generate summaries. Two new ideas are exploited: (1) sentence positions are ranked to emphasize the significances of different sentence positions, and (2) the score function is trained by the genetic algorithm (GA) to obtain a suitable combination of feature weights. The second uses latent semantic analysis (LSA) to derive the semantic matrix of a document or a corpus and uses semantic sentence representation to construct a semantic text relationship map. We evaluate LSA + T.R.M. both with single documents and at the corpus level to investigate the competence of LSA in text summarization. The two novel approaches were measured at several compression rates on a data corpus composed of 100 political articles. At a compression rate of 30%, the approaches achieved average f-measures of 49% for MCBA, 52% for MCBA + GA, and 44% and 40% for LSA + T.R.M. at the single-document and corpus levels, respectively.
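The LSA step above derives latent semantic structure from a term-sentence matrix; the paper's full T.R.M. construction is more involved, but the core of LSA-based sentence ranking can be sketched without a linear-algebra library. The sketch below (function name and the pick-one-sentence simplification are assumptions) builds a term-sentence count matrix and uses power iteration on A^T A to approximate the top right-singular vector, whose entries are the sentences' loadings on the first latent topic:

```python
import math
from collections import Counter

def top_singular_sentence(sentences, iters=100):
    """Return the index of the sentence with the largest loading on the
    first latent topic of the term-sentence matrix (top right-singular
    vector of A, approximated by power iteration on A^T A)."""
    docs = [Counter(s.lower().split()) for s in sentences]
    vocab = sorted(set(t for d in docs for t in d))
    A = [[d[t] for d in docs] for t in vocab]   # terms x sentences counts
    n = len(sentences)
    v = [1.0 / math.sqrt(n)] * n                # start from a uniform vector
    for _ in range(iters):
        u = [sum(row[j] * v[j] for j in range(n)) for row in A]              # u = A v
        w = [sum(A[i][j] * u[i] for i in range(len(A))) for j in range(n)]   # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return max(range(n), key=lambda j: abs(v[j]))
```

In a full summarizer one would keep several singular vectors and rank sentences across topics; a single vector is enough to show where the "semantic sentence representation" comes from.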
  6. Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.01
    0.0133615155 = product of:
      0.026723031 = sum of:
        0.026723031 = product of:
          0.053446062 = sum of:
            0.053446062 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
              0.053446062 = score(doc=6599,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.30952093 = fieldWeight in 6599, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6599)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    26. 2.1997 10:22:43
  7. Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.01
    0.0133615155 = product of:
      0.026723031 = sum of:
        0.026723031 = product of:
          0.053446062 = sum of:
            0.053446062 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
              0.053446062 = score(doc=6751,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.30952093 = fieldWeight in 6751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6751)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15
  8. Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.01
    0.0133615155 = product of:
      0.026723031 = sum of:
        0.026723031 = product of:
          0.053446062 = sum of:
            0.053446062 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
              0.053446062 = score(doc=6974,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.30952093 = fieldWeight in 6974, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6974)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  9. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.01
    0.010021136 = product of:
      0.020042272 = sum of:
        0.020042272 = product of:
          0.040084545 = sum of:
            0.040084545 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
              0.040084545 = score(doc=948,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.23214069 = fieldWeight in 948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=948)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
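The generic extractive component the abstract builds on, SumBasic, is simple enough to sketch: score each sentence by the mean unigram probability of its words, greedily pick the best, then square the probabilities of the chosen words so later picks avoid redundancy. The function name and the whitespace tokenization are assumptions; the scoring and update rule follow the SumBasic scheme:

```python
from collections import Counter

def sumbasic(sentences, n_select):
    """SumBasic-style extractive selection: rank sentences by the mean
    unigram probability of their words; after each pick, square the
    probability of every chosen word type to discourage redundancy."""
    tokenized = [s.lower().split() for s in sentences]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}
    chosen = []
    while len(chosen) < n_select and len(chosen) < len(sentences):
        remaining = [i for i in range(len(sentences)) if i not in chosen]
        best = max(remaining,
                   key=lambda i: sum(prob[w] for w in tokenized[i]) / len(tokenized[i]))
        chosen.append(best)
        for w in set(tokenized[best]):   # square once per word type
            prob[w] = prob[w] ** 2
    return chosen
```

The redundancy update is what makes the second pick jump to fresh vocabulary even when a higher-frequency sentence remains.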
  10. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.01
    0.008350947 = product of:
      0.016701894 = sum of:
        0.016701894 = product of:
          0.033403788 = sum of:
            0.033403788 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
              0.033403788 = score(doc=5290,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.19345059 = fieldWeight in 5290, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5290)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 17:25:48
  11. Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01
    0.008350947 = product of:
      0.016701894 = sum of:
        0.016701894 = product of:
          0.033403788 = sum of:
            0.033403788 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
              0.033403788 = score(doc=2640,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.19345059 = fieldWeight in 2640, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2640)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2016 12:29:41
  12. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.01
    0.008350947 = product of:
      0.016701894 = sum of:
        0.016701894 = product of:
          0.033403788 = sum of:
            0.033403788 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
              0.033403788 = score(doc=889,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.19345059 = fieldWeight in 889, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=889)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2023 18:57:12
  13. Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01
    0.008350947 = product of:
      0.016701894 = sum of:
        0.016701894 = product of:
          0.033403788 = sum of:
            0.033403788 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
              0.033403788 = score(doc=1012,freq=2.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.19345059 = fieldWeight in 1012, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1012)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 6.2023 14:55:20