Search (11 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Automatisches Abstracting"
  • year_i:[2000 TO 2010}
  1. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Automatic multidocument summarization of research abstracts : design and user evaluation (2007) 0.01
    0.0064210845 = product of:
      0.025684338 = sum of:
        0.025684338 = product of:
          0.051368676 = sum of:
            0.051368676 = weight(_text_:research in 522) [ClassicSimilarity], result of:
              0.051368676 = score(doc=522,freq=12.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.38605565 = fieldWeight in 522, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=522)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The purpose of this study was to develop a method for automatic construction of multidocument summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts and relationships, as well as research methods and contextual relations, extracted from different dissertation abstracts. Based on the framework, a new summarization method was developed, which parses the discourse structure of abstracts, extracts research concepts and relationships, integrates the information across different abstracts, and organizes and presents them in a Web-based interface. The focus of this article is on the user evaluation that was performed to assess the overall quality and usefulness of the summaries. Two types of variable-based summaries generated using the summarization method (with or without the use of a taxonomy) were compared against a sentence-based summary that lists only the research-objective sentences extracted from each abstract, and against another sentence-based summary generated using the MEAD system, which extracts important sentences. The evaluation results indicate that the majority of sociological researchers (70%) and general users (64%) preferred the variable-based summaries generated with the use of the taxonomy.
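    The score breakdown attached to each result is Lucene/Solr explain output for ClassicSimilarity (TF-IDF). As a minimal sketch, the arithmetic for entry 1 can be reproduced in a few lines: the constants are copied from the explain tree above, and the tf/idf formulas assumed here are the standard ClassicSimilarity ones, so the printed value should match the displayed 0.0064210845 up to the rounding of the reported constants.

```python
import math

# Constants copied from the explain tree of entry 1 (term "research" in field _text_, doc 522).
freq = 12.0             # termFreq: occurrences of "research" in the document
doc_freq = 6931         # docFreq: documents containing "research"
max_docs = 44218        # maxDocs: documents in the index
query_norm = 0.046639   # queryNorm reported by Lucene
field_norm = 0.0390625  # fieldNorm encoded for doc 522

# ClassicSimilarity building blocks (standard Lucene formulas).
tf = math.sqrt(freq)                               # 3.4641016
idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # 2.8529835
query_weight = idf * query_norm                    # 0.13306029
field_weight = tf * idf * field_norm               # 0.38605565

# coord(1/2) and coord(1/4): one of two and one of four query clauses matched.
score = query_weight * field_weight * (1 / 2) * (1 / 4)
print(score)  # ~0.00642, i.e. the 0.01 shown next to the title after rounding
```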
  2. Moens, M.-F.: Summarizing court decisions (2007) 0.01
    0.00635655 = product of:
      0.0254262 = sum of:
        0.0254262 = product of:
          0.0508524 = sum of:
            0.0508524 = weight(_text_:research in 954) [ClassicSimilarity], result of:
              0.0508524 = score(doc=954,freq=6.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.38217562 = fieldWeight in 954, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=954)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    In the field of law there is an absolute need for summarizing the texts of court decisions in order to make the content of the cases easily accessible for legal professionals. During the SALOMON and MOSAIC projects we investigated the summarization and retrieval of legal cases. This article presents some of the main findings and integrates them with the results of legal document summarization experiments by other research groups. In addition, we propose novel avenues of research for automatic text summarization, which we currently explore when summarizing court decisions in the ACILA project. Techniques for automated concept learning and argument recognition are the most challenging here.
  3. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.00
    0.004739205 = product of:
      0.01895682 = sum of:
        0.01895682 = product of:
          0.03791364 = sum of:
            0.03791364 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
              0.03791364 = score(doc=948,freq=2.0), product of:
                0.16332182 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046639 = queryNorm
                0.23214069 = fieldWeight in 948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=948)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
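    The generic extractive summarization system the abstract builds on is SumBasic, which scores sentences by the average unigram probability of their words and squares the probability of words already covered so that later picks favour new content. Below is a minimal sketch of that base algorithm only (not the topic-focusing, sentence simplification, or lexical expansion components described in the paper); tokenisation and stop-word handling are deliberately simplified.

```python
import re
from collections import Counter

def sumbasic(sentences, max_sentences=3):
    """Greedy SumBasic-style extractive summariser (unigram probabilities only)."""
    tokenised = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    counts = Counter(w for toks in tokenised for w in toks)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}

    chosen, candidates = [], list(range(len(sentences)))
    while candidates and len(chosen) < max_sentences:
        # Score each remaining sentence by the mean probability of its words.
        best = max(candidates,
                   key=lambda i: sum(prob[w] for w in tokenised[i]) / max(len(tokenised[i]), 1))
        chosen.append(best)
        candidates.remove(best)
        # Down-weight covered words so the next pick adds new content.
        for w in set(tokenised[best]):
            prob[w] **= 2
    return [sentences[i] for i in sorted(chosen)]
```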
  4. Pinto, M.: Abstracting/abstract adaptation to digital environments : research trends (2003) 0.00
    0.004448658 = product of:
      0.017794631 = sum of:
        0.017794631 = product of:
          0.035589263 = sum of:
            0.035589263 = weight(_text_:research in 4446) [ClassicSimilarity], result of:
              0.035589263 = score(doc=4446,freq=4.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.2674672 = fieldWeight in 4446, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4446)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The technological revolution is affecting the structure, form and content of documents, reducing the effectiveness of traditional abstracts that, to some extent, are inadequate for the new documentary conditions. Aims to show the directions in which abstracting/abstracts can evolve to achieve the necessary adequacy in the new digital environments. Three research trends are proposed: theoretical, methodological and pragmatic. Theoretically, there is a need to expand the document concept, reengineer abstracting and design interdisciplinary models. Methodologically, the trend is toward the structuring, automating and qualifying of abstracts. Pragmatically, abstract networking, combined with alternative and complementary models, opens a new and promising horizon. Automating, structuring and qualifying abstracting/abstracts offer some short-term prospects for progress. Concludes that reengineering, networking and visualising would be fruitful middle-term areas of research toward the full adequacy of abstracting in the new electronic age.
  5. Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.00
    0.004448658 = product of:
      0.017794631 = sum of:
        0.017794631 = product of:
          0.035589263 = sum of:
            0.035589263 = weight(_text_:research in 932) [ClassicSimilarity], result of:
              0.035589263 = score(doc=932,freq=4.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.2674672 = fieldWeight in 932, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046875 = fieldNorm(doc=932)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.
  6. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.00
    0.0039493376 = product of:
      0.01579735 = sum of:
        0.01579735 = product of:
          0.0315947 = sum of:
            0.0315947 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
              0.0315947 = score(doc=5290,freq=2.0), product of:
                0.16332182 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046639 = queryNorm
                0.19345059 = fieldWeight in 5290, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5290)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 7.2006 17:25:48
  7. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.00
    0.003707215 = product of:
      0.01482886 = sum of:
        0.01482886 = product of:
          0.02965772 = sum of:
            0.02965772 = weight(_text_:research in 657) [ClassicSimilarity], result of:
              0.02965772 = score(doc=657,freq=4.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.22288933 = fieldWeight in 657, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=657)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this research is to develop a method for automatic construction of multi-document summaries of sets of news articles that might be retrieved by a web search engine in response to a user query.
    Design/methodology/approach - Based on cross-document discourse analysis, an event-based framework is proposed for integrating and organizing information extracted from different news articles. It has a hierarchical structure in which the summarized information is presented at the top level and more detailed information given at the lower levels. A tree-view interface was implemented for displaying a multi-document summary based on the framework. A preliminary user evaluation was performed by comparing the framework-based summaries against the sentence-based summaries.
    Findings - In a small evaluation, all the human subjects preferred the framework-based summaries to the sentence-based summaries. It indicates that the event-based framework is an effective way to summarize a set of news articles reporting an event or a series of relevant events.
    Research limitations/implications - Limited to event-based news articles only; not applicable to news critiques and other kinds of news articles. A summarization system based on the event-based framework is being implemented.
    Practical implications - Multi-document summarization of news articles can adopt the proposed event-based framework.
    Originality/value - An event-based framework for summarizing sets of news articles was developed and evaluated using a tree-view interface for displaying such summaries.
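    As an illustration only (not the authors' implementation), the kind of hierarchical, event-based summary a tree-view interface would display can be modelled as a small tree of nodes, each carrying a statement and the articles that support it; the class and field names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SummaryNode:
    """One node of a hierarchical multi-document summary: a statement plus its sources."""
    label: str                                          # event or sub-event statement
    sources: List[str] = field(default_factory=list)    # contributing articles
    children: List["SummaryNode"] = field(default_factory=list)

def render(node: SummaryNode, depth: int = 0) -> str:
    """Render the summary tree as an indented, tree-view-like outline."""
    line = "  " * depth + "- " + node.label
    if node.sources:
        line += "  [" + ", ".join(node.sources) + "]"
    return "\n".join([line] + [render(child, depth + 1) for child in node.children])

# The top level carries the summarized event; lower levels carry the details.
summary = SummaryNode("Earthquake strikes coastal region", ["doc1", "doc3"], [
    SummaryNode("Casualties and damage reported", ["doc1"]),
    SummaryNode("Relief efforts begin", ["doc2", "doc3"]),
])
print(render(summary))
```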
  8. Moens, M.F.; Dumortier, J.: Use of a text grammar for generating highlight abstracts of magazine articles (2000) 0.00
    0.0036699555 = product of:
      0.014679822 = sum of:
        0.014679822 = product of:
          0.029359644 = sum of:
            0.029359644 = weight(_text_:research in 4540) [ClassicSimilarity], result of:
              0.029359644 = score(doc=4540,freq=2.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.22064918 = fieldWeight in 4540, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4540)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Browsing a database of article abstracts is one way to select and buy relevant magazine articles online. Our research contributes to the design and development of text grammars for abstracting texts in unlimited subject domains. We developed a system that parses texts based on the text grammar of a specific text type and that extracts sentences and statements which are relevant for inclusion in the abstracts. The system employs knowledge of the discourse patterns that are typical of news stories. The results are encouraging and demonstrate the importance of discourse structures in text summarisation.
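    The text-grammar approach is much richer than anything that fits here, but the underlying move (scoring sentences against discourse patterns typical of a text type) can be suggested with a toy cue-based scorer; the cue patterns and weights below are invented for illustration and are not taken from the paper.

```python
import re

# Hypothetical discourse cues for news-style text; the real text grammar is far richer.
CUE_PATTERNS = [
    (r'^\s*"', 2.0),                         # opening quotation (attributed statement)
    (r"\b(announc|launch|report)\w*", 1.5),  # main-event verbs
    (r"\b(however|but)\b", 1.0),             # contrastive follow-up
]

def highlight_abstract(sentences, max_sentences=3):
    """Score sentences by position and discourse cues; keep the best in document order."""
    scored = []
    for pos, sent in enumerate(sentences):
        score = 2.0 if pos == 0 else 1.0 / (pos + 1)   # lead sentences matter in news
        for pattern, weight in CUE_PATTERNS:
            if re.search(pattern, sent, flags=re.IGNORECASE):
                score += weight
        scored.append((score, pos, sent))
    best = sorted(scored, key=lambda t: t[0], reverse=True)[:max_sentences]
    return [sent for _, _, sent in sorted(best, key=lambda t: t[1])]
```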
  9. Ling, X.; Jiang, J.; He, X.; Mei, Q.; Zhai, C.; Schatz, B.: Generating gene summaries from biomedical literature : a study of semi-structured summarization (2007) 0.00
    0.0026213971 = product of:
      0.010485589 = sum of:
        0.010485589 = product of:
          0.020971177 = sum of:
            0.020971177 = weight(_text_:research in 946) [ClassicSimilarity], result of:
              0.020971177 = score(doc=946,freq=2.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.15760657 = fieldWeight in 946, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=946)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Most knowledge accumulated through scientific discoveries in genomics and related biomedical disciplines is buried in the vast amount of biomedical literature. Since understanding gene regulations is fundamental to biomedical research, summarizing all the existing knowledge about a gene based on literature is highly desirable to help biologists digest the literature. In this paper, we present a study of methods for automatically generating gene summaries from biomedical literature. Unlike most existing work on automatic text summarization, in which the generated summary is often a list of extracted sentences, we propose to generate a semi-structured summary which consists of sentences covering specific semantic aspects of a gene. Such a semi-structured summary is more appropriate for describing genes and poses special challenges for automatic text summarization. We propose a two-stage approach to generate such a summary for a given gene - first retrieving articles about a gene and then extracting sentences for each specified semantic aspect. We address the issue of gene name variation in the first stage and propose several different methods for sentence extraction in the second stage. We evaluate the proposed methods using a test set with 20 genes. Experiment results show that the proposed methods can generate useful semi-structured gene summaries automatically from biomedical literature, and our proposed methods outperform general purpose summarization methods. Among all the proposed methods for sentence extraction, a probabilistic language modeling approach that models gene context performs the best.
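    For the second stage (extracting sentences for each semantic aspect of a gene), one plausible reading of the probabilistic language modeling approach is to rank candidate sentences by their likelihood under a unigram model estimated from example sentences for that aspect. The sketch below uses add-one smoothing and invented function names; it illustrates the idea rather than the authors' method.

```python
import math
import re
from collections import Counter

def build_aspect_model(example_sentences):
    """Unigram language model for one semantic aspect (e.g. 'mutant phenotypes')."""
    counts = Counter(w for s in example_sentences for w in re.findall(r"[a-z]+", s.lower()))
    total, vocab = sum(counts.values()), len(counts) + 1
    # Add-one smoothing so unseen words keep a small non-zero probability.
    return lambda w: (counts[w] + 1) / (total + vocab)

def rank_for_aspect(candidate_sentences, aspect_model, top_k=2):
    """Rank candidate sentences by average log-probability under the aspect model."""
    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        return (sum(math.log(aspect_model(w)) for w in words) / len(words)
                if words else float("-inf"))
    return sorted(candidate_sentences, key=score, reverse=True)[:top_k]
```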
  10. Yang, C.C.; Wang, F.L.: Hierarchical summarization of large documents (2008) 0.00
    0.0026213971 = product of:
      0.010485589 = sum of:
        0.010485589 = product of:
          0.020971177 = sum of:
            0.020971177 = weight(_text_:research in 1719) [ClassicSimilarity], result of:
              0.020971177 = score(doc=1719,freq=2.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.15760657 = fieldWeight in 1719, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1719)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Many automatic text summarization models have been developed over the last few decades. Related research in information science has shown that human abstractors extract sentences for summaries based on the hierarchical structure of documents; however, existing automatic summarization models do not take the human abstractor's sentence-extraction behavior into account and only consider the document as a sequence of sentences when extracting sentences for a summary. In general, a document exhibits a well-defined hierarchical structure that can be described in terms of fractals, mathematical objects with a high degree of redundancy. In this article, we introduce the fractal summarization model, based on fractal theory. The important information is captured from the source document by exploring the hierarchical structure and salient features of the document. A condensed version of the document that is informatively close to the source document is produced iteratively using the contractive transformation in fractal theory. The fractal summarization model is the first attempt to apply fractal theory to document summarization. It significantly improves the information coverage and the precision of the summary. User evaluations have been conducted, and the results indicate that fractal summarization is promising and outperforms current summarization techniques that do not consider the hierarchical structure of documents.
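    One way to read the hierarchical side of the fractal model is as a recursive quota allocation: a sentence budget is split among a document's sections in proportion to their weight, and the split is repeated at every level until sentences are extracted from the leaves. The sketch below captures only that allocation idea under simple assumptions (section weights and sentence salience scores are given); it is not the contractive transformation or the fractal summarization model itself.

```python
def allocate_quota(node, quota):
    """Recursively split a sentence quota across a document tree by section weight."""
    if "sentences" in node:
        # Leaf section: (sentence, salience) pairs; keep the most salient ones.
        ranked = sorted(node["sentences"], key=lambda pair: pair[1], reverse=True)
        return [sentence for sentence, _ in ranked[:max(quota, 1)]]
    total = sum(child["weight"] for child in node["children"]) or 1
    summary = []
    for child in node["children"]:
        # Each subtree receives a share of the budget proportional to its weight.
        summary.extend(allocate_quota(child, round(quota * child["weight"] / total)))
    return summary
```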
  11. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.00
    0.0020971175 = product of:
      0.00838847 = sum of:
        0.00838847 = product of:
          0.01677694 = sum of:
            0.01677694 = weight(_text_:research in 947) [ClassicSimilarity], result of:
              0.01677694 = score(doc=947,freq=2.0), product of:
                0.13306029 = queryWeight, product of:
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.046639 = queryNorm
                0.12608525 = fieldWeight in 947, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.8529835 = idf(docFreq=6931, maxDocs=44218)
                  0.03125 = fieldNorm(doc=947)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty of evaluating how each particular component would behave across multiple systems. We present a novel integrated information retrieval system, the Query, Cluster, Summarize (QCS) system, which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of methods in the QCS design improves retrievals by providing users with more focused information organized by topic. We demonstrate the improved performance in a series of experiments using standard test sets from the Document Understanding Conferences (DUC), as measured by the best-known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend them to the evaluation of each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
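    The pipeline described here (LSI retrieval, spherical k-means clustering, then an extract per cluster) can be approximated with off-the-shelf scikit-learn pieces. In the sketch below the HMM-plus-pivoted-QR sentence selection is replaced by a crude "document closest to the query" stand-in, and spherical k-means is approximated by k-means on unit-normalised vectors, so this is a rough architectural sketch rather than a reimplementation of QCS.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import normalize

def qcs_like(query, documents, n_clusters=3, n_retrieved=20):
    """Query -> LSI retrieval -> (spherical) k-means clusters -> one extract per cluster."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(documents + [query])
    # LSI: project documents and the query into a low-rank latent space.
    k_dims = max(1, min(50, tfidf.shape[0] - 1, tfidf.shape[1] - 1))
    lsi = normalize(TruncatedSVD(n_components=k_dims).fit_transform(tfidf))
    doc_vecs, query_vec = lsi[:-1], lsi[-1]
    # Retrieve the documents most similar to the query (cosine = dot on unit vectors).
    sims = doc_vecs @ query_vec
    top = np.argsort(sims)[::-1][:n_retrieved]
    # Cluster the retrieved documents by topic.
    k = min(n_clusters, len(top))
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(doc_vecs[top])
    # Per cluster, return the member closest to the query as a stand-in "summary".
    summaries = {}
    for c in range(k):
        members = top[labels == c]
        summaries[c] = documents[members[np.argmax(sims[members])]]
    return summaries
```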