Search (33 results, page 1 of 2)

Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.03
```
0.02631331 = product of:
  0.06578328 = sum of:
    0.05012735 = weight(_text_:wide in 1012) [ClassicSimilarity], result of:
      0.05012735 = score(doc=1012,freq=2.0), product of:
        0.20479609 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046221454 = queryNorm
        0.24476713 = fieldWeight in 1012, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1012)
    0.015655924 = product of:
      0.031311847 = sum of:
        0.031311847 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
          0.031311847 = score(doc=1012,freq=2.0), product of:
            0.16185966 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046221454 = queryNorm
            0.19345059 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases are contributing increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp the paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the Macro-avgs of , , and on the Paper with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.

Date

22. 6.2023 14:55:20
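
The score breakdown for the first result (doc 1012) can be reproduced by hand. The following is a minimal sketch, assuming the listed tree follows the usual Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of any API. The constants `queryNorm` and `fieldNorm` are copied from the listing rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single term."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm             # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 1012.
MAX_DOCS = 44218
QUERY_NORM = 0.046221454
FIELD_NORM = 0.0390625

wide = term_score(2.0, 1430, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "22" clause sits in a nested query with coord(1/2);
# the outer query matched 2 of 5 clauses, hence coord(2/5).
total = (wide + twenty_two * 0.5) * 0.4

print(f"wide={wide:.8f} nested_22={twenty_two * 0.5:.8f} total={total:.8f}")
```

Lucene performs these steps in 32-bit floats, so the values computed here should agree with the listed 0.02631331 only up to small rounding differences.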
Ou, S.; Khoo, C.S.G.; Goh, D.H.: Automatic multidocument summarization of research abstracts : design and user evaluation (2007) 0.02
```
0.02105975 = product of:
  0.05264937 = sum of:
    0.027194975 = weight(_text_:web in 522) [ClassicSimilarity], result of:
      0.027194975 = score(doc=522,freq=2.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.18028519 = fieldWeight in 522, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=522)
    0.025454395 = product of:
      0.05090879 = sum of:
        0.05090879 = weight(_text_:research in 522) [ClassicSimilarity], result of:
          0.05090879 = score(doc=522,freq=12.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.38605565 = fieldWeight in 522, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.0390625 = fieldNorm(doc=522)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

The purpose of this study was to develop a method for automatic construction of multidocument summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts and relationships as well as research methods and contextual relations extracted from different dissertation abstracts. Based on the framework, a new summarization method was developed, which parses the discourse structure of abstracts, extracts research concepts and relationships, integrates the information across different abstracts, and organizes and presents them in a Web-based interface. The focus of this article is on the user evaluation that was performed to assess the overall quality and usefulness of the summaries. Two types of variable-based summaries generated using the summarization method (with or without the use of a taxonomy) were compared against a sentence-based summary that lists only the research-objective sentences extracted from each abstract and another sentence-based summary generated using the MEAD system that extracts important sentences. The evaluation results indicate that the majority of sociological researchers (70%) and general users (64%) preferred the variable-based summaries generated with the use of the taxonomy.
Shen, D.; Yang, Q.; Chen, Z.: Noise reduction through summarization for Web-page classification (2007) 0.02
```
0.019580381 = product of:
  0.0979019 = sum of:
    0.0979019 = weight(_text_:web in 953) [ClassicSimilarity], result of:
      0.0979019 = score(doc=953,freq=18.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.64902663 = fieldWeight in 953, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=953)
  0.2 = coord(1/5)
```
Abstract

Due to a large variety of noisy information embedded in Web pages, Web-page classification is much more difficult than pure-text classification. In this paper, we propose to improve the Web-page classification performance by removing the noise through summarization techniques. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then put forward a new Web-page summarization algorithm based on Web-page layout and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that the classification algorithms (NB or SVM) augmented by any summarization approach can achieve an improvement by more than 5.0% as compared to pure-text-based classification algorithms. We further introduce an ensemble method to combine the different summarization algorithms. The ensemble summarization method achieves more than 12.0% improvement over pure-text based methods.

Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.02

0.019425297 = product of:
  0.097126484 = sum of:
    0.097126484 = sum of:
      0.04702753 = weight(_text_:research in 6974) [ClassicSimilarity], result of:
        0.04702753 = score(doc=6974,freq=4.0), product of:
          0.13186905 = queryWeight, product of:
            2.8529835 = idf(docFreq=6931, maxDocs=44218)
            0.046221454 = queryNorm
          0.35662293 = fieldWeight in 6974, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            2.8529835 = idf(docFreq=6931, maxDocs=44218)
            0.0625 = fieldNorm(doc=6974)
      0.050098952 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
        0.050098952 = score(doc=6974,freq=2.0), product of:
          0.16185966 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046221454 = queryNorm
          0.30952093 = fieldWeight in 6974, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=6974)
  0.2 = coord(1/5)

Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon

Xu, D.; Cheng, G.; Qu, Y.: Preferences in Wikipedia abstracts : empirical findings and implications for automatic entity summarization (2014) 0.02

0.018041609 = product of:
  0.045104023 = sum of:
    0.032633968 = weight(_text_:web in 2700) [ClassicSimilarity], result of:
      0.032633968 = score(doc=2700,freq=2.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.21634221 = fieldWeight in 2700, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2700)
    0.012470056 = product of:
      0.024940113 = sum of:
        0.024940113 = weight(_text_:research in 2700) [ClassicSimilarity], result of:
          0.024940113 = score(doc=2700,freq=2.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.18912788 = fieldWeight in 2700, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046875 = fieldNorm(doc=2700)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The volume of entity-centric structured data grows rapidly on the Web. The description of an entity, composed of property-value pairs (a.k.a. features), has become very large in many applications. To avoid information overload, efforts have been made to automatically select a limited number of features to be shown to the user based on certain criteria, which is called automatic entity summarization. However, to the best of our knowledge, there is a lack of extensive studies on how humans rank and select features in practice, which can provide empirical support and inspire future research. In this article, we present a large-scale statistical analysis of the descriptions of entities provided by DBpedia and the abstracts of their corresponding Wikipedia articles, to empirically study, along several different dimensions, which kinds of features are preferable when humans summarize. Implications for automatic entity summarization are drawn from the findings.

Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.02
```
0.01675643 = product of:
  0.041891076 = sum of:
    0.027194975 = weight(_text_:web in 657) [ClassicSimilarity], result of:
      0.027194975 = score(doc=657,freq=2.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.18028519 = fieldWeight in 657, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=657)
    0.014696103 = product of:
      0.029392205 = sum of:
        0.029392205 = weight(_text_:research in 657) [ClassicSimilarity], result of:
          0.029392205 = score(doc=657,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.22288933 = fieldWeight in 657, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.0390625 = fieldNorm(doc=657)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Purpose - The purpose of this research is to develop a method for automatic construction of multi-document summaries of sets of news articles that might be retrieved by a web search engine in response to a user query. Design/methodology/approach - Based on the cross-document discourse analysis, an event-based framework is proposed for integrating and organizing information extracted from different news articles. It has a hierarchical structure in which the summarized information is presented at the top level and more detailed information given at the lower levels. A tree-view interface was implemented for displaying a multi-document summary based on the framework. A preliminary user evaluation was performed by comparing the framework-based summaries against the sentence-based summaries. Findings - In a small evaluation, all the human subjects preferred the framework-based summaries to the sentence-based summaries. It indicates that the event-based framework is an effective way to summarize a set of news articles reporting an event or a series of relevant events. Research limitations/implications - Limited to event-based news articles only, not applicable to news critiques and other kinds of news articles. A summarization system based on the event-based framework is being implemented. Practical implications - Multi-document summarization of news articles can adopt the proposed event-based framework. Originality/value - An event-based framework for summarizing sets of news articles was developed and evaluated using a tree-view interface for displaying such summaries.
Yulianti, E.; Huspi, S.; Sanderson, M.: Tweet-biased summarization (2016) 0.01
```
0.01087799 = product of:
  0.05438995 = sum of:
    0.05438995 = weight(_text_:web in 2926) [ClassicSimilarity], result of:
      0.05438995 = score(doc=2926,freq=8.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.36057037 = fieldWeight in 2926, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2926)
  0.2 = coord(1/5)
```
Abstract

We examined whether the microblog comments given by people after reading a web document could be exploited to improve the accuracy of a web document summarization system. We examined the effect of social information (i.e., tweets) on the accuracy of the generated summaries by comparing the user preference for TBS (tweet-biased summary) with GS (generic summary). The result of a crowdsourcing-based evaluation shows that user preference for TBS was significantly higher than for GS. We also took random samples of the documents to see the performance of summaries in a traditional evaluation using ROUGE, in which TBS was, in general, also shown to be better than GS. We further analyzed the influence of the number of tweets pointing to a web document on summarization accuracy, finding a moderate positive correlation between the number of tweets pointing to a web document and the performance of the generated TBS as measured by user preference. The results show that incorporating social information into the summary generation process can improve the accuracy of the summary. The reason for people choosing one summary over another in a crowdsourcing-based evaluation is also presented in this article.
Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.01
```
0.010419055 = product of:
  0.052095275 = sum of:
    0.052095275 = sum of:
      0.020783428 = weight(_text_:research in 889) [ClassicSimilarity], result of:
        0.020783428 = score(doc=889,freq=2.0), product of:
          0.13186905 = queryWeight, product of:
            2.8529835 = idf(docFreq=6931, maxDocs=44218)
            0.046221454 = queryNorm
          0.15760657 = fieldWeight in 889, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.8529835 = idf(docFreq=6931, maxDocs=44218)
            0.0390625 = fieldNorm(doc=889)
      0.031311847 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
        0.031311847 = score(doc=889,freq=2.0), product of:
          0.16185966 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046221454 = queryNorm
          0.19345059 = fieldWeight in 889, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=889)
  0.2 = coord(1/5)
```
Abstract

The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.

Date

22. 1.2023 18:57:12

Johnson, F.C.: A critical view of system-centered to user-centered evaluation of automatic abstracting research (1999) 0.01

0.007054129 = product of:
  0.035270646 = sum of:
    0.035270646 = product of:
      0.07054129 = sum of:
        0.07054129 = weight(_text_:research in 2994) [ClassicSimilarity], result of:
          0.07054129 = score(doc=2994,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.5349344 = fieldWeight in 2994, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.09375 = fieldNorm(doc=2994)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: New review of information and library research. 5(1999), S.49-63

Liang, S.-F.; Devlin, S.; Tait, J.: Investigating sentence weighting components for automatic summarisation (2007) 0.01
```
0.006526794 = product of:
  0.032633968 = sum of:
    0.032633968 = weight(_text_:web in 899) [ClassicSimilarity], result of:
      0.032633968 = score(doc=899,freq=2.0), product of:
        0.1508442 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046221454 = queryNorm
        0.21634221 = fieldWeight in 899, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=899)
  0.2 = coord(1/5)
```
Abstract

The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order algorithm. It subsequently proved to be a reliable indicator for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data, and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. The summaries produced were evaluated by the ROUGE-1 metric, and the results showed that using QTO in a weighting combination resulted in the best performance. We also found that using a combination of more weighting components always produced improved performance compared to any single weighting component.
Moens, M.-F.: Summarizing court decisions (2007) 0.01
```
0.005039714 = product of:
  0.025198568 = sum of:
    0.025198568 = product of:
      0.050397135 = sum of:
        0.050397135 = weight(_text_:research in 954) [ClassicSimilarity], result of:
          0.050397135 = score(doc=954,freq=6.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.38217562 = fieldWeight in 954, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.0546875 = fieldNorm(doc=954)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

In the field of law there is an absolute need for summarizing the texts of court decisions in order to make the content of the cases easily accessible for legal professionals. During the SALOMON and MOSAIC projects we investigated the summarization and retrieval of legal cases. This article presents some of the main findings while integrating the research results of experiments on legal document summarization by other research groups. In addition, we propose novel avenues of research for automatic text summarization, which we currently exploit when summarizing court decisions in the ACILA project. Techniques for automated concept learning and argument recognition are here the most challenging.

Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.01

0.005009895 = product of:
  0.025049476 = sum of:
    0.025049476 = product of:
      0.050098952 = sum of:
        0.050098952 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
          0.050098952 = score(doc=6599,freq=2.0), product of:
            0.16185966 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046221454 = queryNorm
            0.30952093 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 26. 2.1997 10:22:43

Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.01

0.005009895 = product of:
  0.025049476 = sum of:
    0.025049476 = product of:
      0.050098952 = sum of:
        0.050098952 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
          0.050098952 = score(doc=6751,freq=2.0), product of:
            0.16185966 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046221454 = queryNorm
            0.30952093 = fieldWeight in 6751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6751)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 6. 3.1997 16:22:15

Sparck Jones, K.; Endres-Niggemeyer, B.: Introduction: automatic summarizing (1995) 0.00

0.004702753 = product of:
  0.023513764 = sum of:
    0.023513764 = product of:
      0.04702753 = sum of:
        0.04702753 = weight(_text_:research in 2931) [ClassicSimilarity], result of:
          0.04702753 = score(doc=2931,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.35662293 = fieldWeight in 2931, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.0625 = fieldNorm(doc=2931)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: Automatic summarizing is a research topic whose time has come. The papers illustrate some of the relevant work already under way. Places these papers in their wider context: why research and development on automatic summarizing is timely, what areas of work and ideas it should draw on, how future investigations and experiments can be effectively framed

Johnson, F.: Automatic abstracting research (1995) 0.00

0.004702753 = product of:
  0.023513764 = sum of:
    0.023513764 = product of:
      0.04702753 = sum of:
        0.04702753 = weight(_text_:research in 3847) [ClassicSimilarity], result of:
          0.04702753 = score(doc=3847,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.35662293 = fieldWeight in 3847, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.0625 = fieldNorm(doc=3847)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: Discusses the attraction for researchers of the prospect of automatically generating abstracts but notes that the promise of superseding the human effort has yet to be realized. Notes ways in which progress in automatic abstracting research may come about and suggests a shift in the aim from reproducing the conventional benefits of abstracts to accentuating the advantages to users of the computerized representation of information in large textual databases

Su, H.: Automatic abstracting (1996) 0.00

0.004156686 = product of:
  0.020783428 = sum of:
    0.020783428 = product of:
      0.041566856 = sum of:
        0.041566856 = weight(_text_:research in 150) [ClassicSimilarity], result of:
          0.041566856 = score(doc=150,freq=2.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.31521314 = fieldWeight in 150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.078125 = fieldNorm(doc=150)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: Presents an introductory overview of research into the automatic construction of abstracts from the texts of documents. Discusses the origin and definition of automatic abstracting; reasons for using automatic abstracting; methods of automatic abstracting; and evaluation problems

Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.00
```
0.0037574214 = product of:
  0.018787106 = sum of:
    0.018787106 = product of:
      0.037574213 = sum of:
        0.037574213 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
          0.037574213 = score(doc=948,freq=2.0), product of:
            0.16185966 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046221454 = queryNorm
            0.23214069 = fieldWeight in 948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
Pinto, M.: Abstracting/abstract adaptation to digital environments : research trends (2003) 0.00
```
0.0035270646 = product of:
  0.017635323 = sum of:
    0.017635323 = product of:
      0.035270646 = sum of:
        0.035270646 = weight(_text_:research in 4446) [ClassicSimilarity], result of:
          0.035270646 = score(doc=4446,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.2674672 = fieldWeight in 4446, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046875 = fieldNorm(doc=4446)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

The technological revolution is affecting the structure, form and content of documents, reducing the effectiveness of traditional abstracts that, to some extent, are inadequate to the new documentary conditions. Aims to show the directions in which abstracting/abstracts can evolve to achieve the necessary adequacy in the new digital environments. Three research trends are proposed: theoretical, methodological and pragmatic. Theoretically, there are some needs for expanding the document concept, reengineering abstracting and designing interdisciplinary models. Methodologically, the trend is toward the structuring, automating and qualifying of the abstracts. Pragmatically, abstracts networking, combined with alternative and complementary models, open a new and promising horizon. Automating, structuring and qualifying abstracting/abstract offer some short-term prospects for progress. Concludes that reengineering, networking and visualising would be middle-term fruitful areas of research toward the full adequacy of abstracting in the new electronic age.
Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.00
```
0.0035270646 = product of:
  0.017635323 = sum of:
    0.017635323 = product of:
      0.035270646 = sum of:
        0.035270646 = weight(_text_:research in 932) [ClassicSimilarity], result of:
          0.035270646 = score(doc=932,freq=4.0), product of:
            0.13186905 = queryWeight, product of:
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046221454 = queryNorm
            0.2674672 = fieldWeight in 932, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.8529835 = idf(docFreq=6931, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.

Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.00

0.0031311847 = product of:
  0.015655924 = sum of:
    0.015655924 = product of:
      0.031311847 = sum of:
        0.031311847 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
          0.031311847 = score(doc=5290,freq=2.0), product of:
            0.16185966 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046221454 = queryNorm
            0.19345059 = fieldWeight in 5290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 22. 7.2006 17:25:48
