Search (86 results, page 2 of 5)

  • Filter: language_ss:"e"
  • Filter: theme_ss:"Automatisches Abstracting"
  1. Sweeney, S.; Crestani, F.; Losada, D.E.: 'Show me more' : incremental length summarisation using novelty detection (2008) 0.00
    0.004935273 = product of:
      0.019741092 = sum of:
        0.019741092 = weight(_text_:information in 2054) [ClassicSimilarity], result of:
          0.019741092 = score(doc=2054,freq=12.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.23754507 = fieldWeight in 2054, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2054)
      0.25 = coord(1/4)
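     The breakdown above (and the analogous trees in the entries below) is Lucene's ClassicSimilarity TF-IDF explanation. As a worked sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq); idf = 1 + ln(maxDocs/(docFreq+1))), the displayed score of this entry can be reproduced in a few lines of Python; only the figures shown in the tree are used:

      import math

      # Figures copied from the explanation tree above (doc 2054, term "information")
      freq       = 12.0                               # termFreq
      idf        = 1 + math.log(44218 / (20772 + 1))  # ~1.7554779 = idf(docFreq=20772, maxDocs=44218)
      query_norm = 0.047340166                        # queryNorm
      field_norm = 0.0390625                          # fieldNorm(doc=2054)
      coord      = 1 / 4                              # coord(1/4): 1 of 4 query clauses matched

      tf           = math.sqrt(freq)                  # 3.4641016 = tf(freq=12.0)
      query_weight = idf * query_norm                 # 0.08310462 = queryWeight
      field_weight = tf * idf * field_norm            # 0.23754507 = fieldWeight
      print(coord * query_weight * field_weight)      # ~0.004935273, the listed score

     The per-entry score differences in this list come only from freq, fieldNorm, and the document in question; queryWeight and coord are constant across results.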
    
    Abstract
     The paper presents a study investigating the effects of incorporating novelty detection in automatic text summarisation. By condensing a textual document, automatic text summarisation can reduce the need to refer to the source document; it also offers a means to deliver device-friendly content when accessing information in non-traditional environments. An effective method of summarisation could be to produce a summary that includes only novel information. However, focusing exclusively on novel parts may result in a loss of context, which may affect the correct interpretation of the summary with respect to the source document. In this study we compare two strategies for producing summaries that incorporate novelty in different ways: a constant-length summary, which contains only novel sentences, and an incremental summary, containing additional sentences that provide context. The aim is to establish whether a summary that contains only novel sentences provides a sufficient basis for determining the relevance of a document, or whether additional sentences are needed to provide context. Findings from the study suggest that there is only a minimal difference in performance for the tasks we set our users, and that the presence of contextual information is not especially important. However, for mobile information access, a summary that contains only novel information does offer benefits, given bandwidth constraints.
    Source
     Information processing and management. 44(2008) no.2, pp. 663-686
  2. Marcu, D.: Automatic abstracting and summarization (2009) 0.00
    0.0048856717 = product of:
      0.019542687 = sum of:
        0.019542687 = weight(_text_:information in 3748) [ClassicSimilarity], result of:
          0.019542687 = score(doc=3748,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.23515764 = fieldWeight in 3748, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3748)
      0.25 = coord(1/4)
    
    Abstract
    After lying dormant for a few decades, the field of automated text summarization has experienced a tremendous resurgence of interest. Recently, many new algorithms and techniques have been proposed for identifying important information in single documents and document collections, and for mapping this information into grammatical, cohesive, and coherent abstracts. Since 1997, annual workshops, conferences, and large-scale comparative evaluations have provided a rich environment for exchanging ideas between researchers in Asia, Europe, and North America. This entry reviews the main developments in the field and provides a guiding map to those interested in understanding the strengths and weaknesses of an increasingly ubiquitous technology.
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  3. Craven, T.C.: An experiment in the use of tools for computer-assisted abstracting (1996) 0.00
    0.0048355605 = product of:
      0.019342242 = sum of:
        0.019342242 = weight(_text_:information in 7426) [ClassicSimilarity], result of:
          0.019342242 = score(doc=7426,freq=8.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.23274569 = fieldWeight in 7426, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=7426)
      0.25 = coord(1/4)
    
    Abstract
     Experimental subjects wrote abstracts of an article using a simplified version of the TEXNET abstracting assistance software. In addition to the full text, the 35 subjects were presented with either keywords or phrases extracted automatically. The resulting abstracts, and the times taken, were recorded automatically; some additional information was gathered by oral questionnaire. Results showed considerable variation among subjects, but 37% found the keywords or phrases quite or very useful in writing their abstracts. Statistical analysis failed to support several hypothesised relations: phrases were not viewed as significantly more helpful than keywords, and abstracting experience did not correlate with originality of wording, approximation of the author abstract, or greater conciseness. Results also suggested possible modifications to the software.
    Imprint
    Medford, NJ : Learned Information
    Source
    Global complexity: information, chaos and control. Proceedings of the 59th Annual Meeting of the American Society for Information Science, ASIS'96, Baltimore, Maryland, 21-24 Oct 1996. Ed.: S. Hardin
  4. Johnson, F.C.: A critical view of system-centered to user-centered evaluation of automatic abstracting research (1999) 0.00
    0.0048355605 = product of:
      0.019342242 = sum of:
        0.019342242 = weight(_text_:information in 2994) [ClassicSimilarity], result of:
          0.019342242 = score(doc=2994,freq=2.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.23274569 = fieldWeight in 2994, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=2994)
      0.25 = coord(1/4)
    
    Source
     New review of information and library research. 5(1999), pp. 49-63
  5. Díaz, A.; Gervás, P.: User-model based personalized summarization (2007) 0.00
    0.0048355605 = product of:
      0.019342242 = sum of:
        0.019342242 = weight(_text_:information in 952) [ClassicSimilarity], result of:
          0.019342242 = score(doc=952,freq=8.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.23274569 = fieldWeight in 952, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=952)
      0.25 = coord(1/4)
    
    Abstract
     The potential of summary personalization is high: a generic summary that would be useless for deciding the relevance of a document may become useful if the selected sentences match the user's interests. In this paper we defend the use of a personalized summarization facility to maximize the density of relevance of selections sent by a personalized information system to a given user. Personalization is applied to the digital newspaper domain, using a user model that stores long- and short-term interests through four reference systems: sections, categories, keywords, and feedback terms. At the same time, it is crucial to measure how much information is lost during the summarization process, and how this loss may affect the user's ability to judge the relevance of a given document. The results obtained in two personalization systems show that personalized summaries perform better than generic and generic-personalized summaries in terms of identifying documents that satisfy user preferences. We also conducted a user-centred direct evaluation that showed a high level of user satisfaction with the summaries.
    Source
     Information processing and management. 43(2007) no.6, pp. 1715-1734
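  As a minimal illustration of the user-model-biased selection described in the abstract above (a sketch only: the scoring by keyword overlap, the function name, and the toy data are assumptions, not the authors' system):

      # Hypothetical sketch: rank sentences by overlap with the user's keyword
      # interests (one of the four reference systems named in the abstract) and
      # keep the top k as the personalized summary.
      def personalized_summary(sentences, user_keywords, k=3):
          def score(sentence):
              return len(set(sentence.lower().split()) & user_keywords)
          ranked = sorted(sentences, key=score, reverse=True)
          return [s for s in ranked[:k] if score(s) > 0]

      summary = personalized_summary(
          ["The court ruled on the appeal.",
           "Markets reacted calmly to the ruling.",
           "A new exhibition opened downtown."],
          user_keywords={"markets", "ruling", "court"})
      print(summary)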
  6. Liu, J.; Wu, Y.; Zhou, L.: A hybrid method for abstracting newspaper articles (1999) 0.00
    0.00455901 = product of:
      0.01823604 = sum of:
        0.01823604 = weight(_text_:information in 4059) [ClassicSimilarity], result of:
          0.01823604 = score(doc=4059,freq=4.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.21943474 = fieldWeight in 4059, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4059)
      0.25 = coord(1/4)
    
    Abstract
     This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type catering for various users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
    Source
     Journal of the American Society for Information Science. 50(1999) no.13, pp. 1234-1245
  7. Rodríguez-Vidal, J.; Carrillo-de-Albornoz, J.; Gonzalo, J.; Plaza, L.: Authority and priority signals in automatic summary generation for online reputation management (2021) 0.00
    0.0045052674 = product of:
      0.01802107 = sum of:
        0.01802107 = weight(_text_:information in 213) [ClassicSimilarity], result of:
          0.01802107 = score(doc=213,freq=10.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.21684799 = fieldWeight in 213, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=213)
      0.25 = coord(1/4)
    
    Abstract
     Online reputation management (ORM) comprises the collection of techniques that help monitor and improve the public image of an entity (companies, products, institutions) on the Internet. ORM experts try to minimize the negative impact of information about an entity while maximizing the positive material, so as to appear more trustworthy to customers. Because a huge amount of information is published on the Internet every day, the entire flow of information needs to be summarized to obtain only those data that are relevant to the entities. Traditionally, the automatic summarization task in the ORM scenario takes in-domain signals into account, such as popularity, polarity for reputation, and novelty, but another feature should also be considered: the authority of the people involved. This authority depends on the ability to convince others and therefore to influence opinions. In this work, we propose the use of authority signals that measure the influence of a user, jointly with (a) priority signals related to the ORM domain and (b) information regarding the different topics that influential people are talking about. Our results indicate that the use of authority signals may significantly improve the quality of automatically generated summaries.
    Source
     Journal of the Association for Information Science and Technology. 72(2021) no.5, pp. 583-594
  8. Endres-Niggemeyer, B.: Summarizing information (1998) 0.00
    0.0041877185 = product of:
      0.016750874 = sum of:
        0.016750874 = weight(_text_:information in 688) [ClassicSimilarity], result of:
          0.016750874 = score(doc=688,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.20156369 = fieldWeight in 688, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=688)
      0.25 = coord(1/4)
    
    Abstract
     Summarizing is the process of reducing the large information size of something like a novel or a scientific paper to a short summary or abstract comprising only the most essential points. Summarizing is frequent in everyday communication, but it is also a professional skill for journalists and others. Automated summarizing functions are urgently needed by Internet users who wish to avoid being overwhelmed by information. This book presents the state of the art and surveys related research; it deals with everyday and professional summarizing as well as computerized approaches. The author focuses in detail on the cognitive process involved in summarizing and supports this with a multimedia simulation system on the accompanying CD-ROM.
  9. Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.00
    0.0041877185 = product of:
      0.016750874 = sum of:
        0.016750874 = weight(_text_:information in 945) [ClassicSimilarity], result of:
          0.016750874 = score(doc=945,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.20156369 = fieldWeight in 945, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
      0.25 = coord(1/4)
    
    Abstract
    The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. Along with that, we address the issue of data paucity by collecting data from RSS feeds available on the Internet, and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].
    Source
     Information processing and management. 43(2007) no.6, pp. 1571-1587
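  The compression-as-sequence-labelling idea in the abstract above can be sketched with an off-the-shelf CRF. A minimal sketch, assuming the sklearn-crfsuite package is installed; the features, toy keep/drop data, and names are illustrative, not the paper's:

      import sklearn_crfsuite  # assumption: sklearn-crfsuite is installed

      def feats(sent, i):
          # Per-token features; real systems add syntactic information, which
          # the paper exploits via a recursive application of CRF.
          return {"word": sent[i].lower(),
                  "pos_frac": i / len(sent),
                  "prev": sent[i - 1].lower() if i else "<s>",
                  "next": sent[i + 1].lower() if i < len(sent) - 1 else "</s>"}

      # Toy training pairs: each word labelled "keep" or "drop" (the paper
      # harvests such pairs from RSS feeds instead).
      sents  = [["The", "very", "old", "house", "collapsed"],
                ["A", "rather", "large", "crowd", "gathered"]]
      labels = [["keep", "drop", "keep", "keep", "keep"],
                ["keep", "drop", "drop", "keep", "keep"]]

      crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
      crf.fit([[feats(s, i) for i in range(len(s))] for s in sents], labels)

      test = ["The", "quite", "old", "bridge", "collapsed"]
      tags = crf.predict_single([feats(test, i) for i in range(len(test))])
      print(" ".join(w for w, t in zip(test, tags) if t == "keep"))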
  10. Johnson, F.C.; Paice, C.D.; Black, W.J.; Neal, A.P.: The application of linguistic processing to automatic abstract generation (1993) 0.00
    0.0040296335 = product of:
      0.016118534 = sum of:
        0.016118534 = weight(_text_:information in 2290) [ClassicSimilarity], result of:
          0.016118534 = score(doc=2290,freq=2.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.19395474 = fieldWeight in 2290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=2290)
      0.25 = coord(1/4)
    
    Footnote
     Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997, pp. 538-552.
  11. Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.00
    0.0040296335 = product of:
      0.016118534 = sum of:
        0.016118534 = weight(_text_:information in 1949) [ClassicSimilarity], result of:
          0.016118534 = score(doc=1949,freq=2.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.19395474 = fieldWeight in 1949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
      0.25 = coord(1/4)
    
    Footnote
     Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997, pp. 478-483.
  12. Marsh, E.: A production rule system for message summarisation (1984) 0.00
    0.0040296335 = product of:
      0.016118534 = sum of:
        0.016118534 = weight(_text_:information in 1956) [ClassicSimilarity], result of:
          0.016118534 = score(doc=1956,freq=2.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.19395474 = fieldWeight in 1956, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1956)
      0.25 = coord(1/4)
    
    Footnote
     Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997, pp. 534-537.
  13. Reeve, L.H.; Han, H.; Brooks, A.D.: ¬The use of domain-specific concepts in biomedical text summarization (2007) 0.00
    0.0040296335 = product of:
      0.016118534 = sum of:
        0.016118534 = weight(_text_:information in 955) [ClassicSimilarity], result of:
          0.016118534 = score(doc=955,freq=8.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.19395474 = fieldWeight in 955, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=955)
      0.25 = coord(1/4)
    
    Abstract
    Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high-volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually-generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly-selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full-text.
    Source
     Information processing and management. 43(2007) no.6, pp. 1765-1776
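  As a rough sketch of the frequency-distribution flavour of sentence scoring mentioned above (illustrative only: the paper's FreqDist method works over domain-specific concepts rather than raw words and explicitly removes redundancy; the function name and weighting here are assumptions):

      from collections import Counter

      def freq_summary(sentences, k=3):
          # Score each sentence by the mean corpus frequency of its words;
          # frequent terms stand in for the domain concepts used in the paper.
          tf = Counter(w for s in sentences for w in s.lower().split())
          def score(s):
              words = s.lower().split()
              return sum(tf[w] for w in words) / len(words)
          return sorted(sentences, key=score, reverse=True)[:k]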
  14. Yang, C.C.; Wang, F.L.: Hierarchical summarization of large documents (2008) 0.00
    0.0040296335 = product of:
      0.016118534 = sum of:
        0.016118534 = weight(_text_:information in 1719) [ClassicSimilarity], result of:
          0.016118534 = score(doc=1719,freq=8.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.19395474 = fieldWeight in 1719, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1719)
      0.25 = coord(1/4)
    
    Abstract
    Many automatic text summarization models have been developed in the last decades. Related research in information science has shown that human abstractors extract sentences for summaries based on the hierarchical structure of documents; however, the existing automatic summarization models do not take into account the human abstractor's behavior of sentence extraction and only consider the document as a sequence of sentences during the process of extraction of sentences as a summary. In general, a document exhibits a well-defined hierarchical structure that can be described as fractals - mathematical objects with a high degree of redundancy. In this article, we introduce the fractal summarization model based on the fractal theory. The important information is captured from the source document by exploring the hierarchical structure and salient features of the document. A condensed version of the document that is informatively close to the source document is produced iteratively using the contractive transformation in the fractal theory. The fractal summarization model is the first attempt to apply fractal theory to document summarization. It significantly improves the divergence of information coverage of summary and the precision of summary. User evaluations have been conducted. Results have indicated that fractal summarization is promising and outperforms current summarization techniques that do not consider the hierarchical structure of documents.
    Source
     Journal of the American Society for Information Science and Technology. 59(2008) no.6, pp. 887-902
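  A minimal sketch of the hierarchical idea described above: distribute the extraction quota down the document tree in proportion to node weights, then pick sentences at the leaves. The node layout, weights, and scoring function are assumptions; the actual model applies fractal theory's contractive transformation iteratively:

      def extract(node, quota, score):
          # Leaf: a passage; pick its best sentences.
          if "sentences" in node:
              return sorted(node["sentences"], key=score, reverse=True)[:quota]
          # Internal node: split the quota among children by weight, then recurse.
          total = sum(c["weight"] for c in node["children"])
          picked = []
          for c in node["children"]:
              share = max(1, round(quota * c["weight"] / total))
              picked.extend(extract(c, share, score))
          return picked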
  15. Soricut, R.; Marcu, D.: Abstractive headline generation using WIDL-expressions (2007) 0.00
    0.003989134 = product of:
      0.015956536 = sum of:
        0.015956536 = weight(_text_:information in 943) [ClassicSimilarity], result of:
          0.015956536 = score(doc=943,freq=4.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.1920054 = fieldWeight in 943, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=943)
      0.25 = coord(1/4)
    
    Abstract
    We present a new paradigm for the automatic creation of document headlines that is based on direct transformation of relevant textual information into well-formed textual output. Starting from an input document, we automatically create compact representations of weighted finite sets of strings, called WIDL-expressions, which encode the most important topics in the document. A generic natural language generation engine performs the headline generation task, driven by both statistical knowledge encapsulated in WIDL-expressions (representing topic biases induced by the input document) and statistical knowledge encapsulated in language models (representing biases induced by the target language). Our evaluation shows similar performance in quality with a state-of-the-art, extractive approach to headline generation, and significant improvements in quality over previously proposed solutions to abstractive headline generation.
    Source
     Information processing and management. 43(2007) no.6, pp. 1536-1548
  16. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.00
    0.003604214 = product of:
      0.014416856 = sum of:
        0.014416856 = weight(_text_:information in 947) [ClassicSimilarity], result of:
          0.014416856 = score(doc=947,freq=10.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.1734784 = fieldWeight in 947, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=947)
      0.25 = coord(1/4)
    
    Abstract
    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel integrated information retrieval system-the Query, Cluster, Summarize (QCS) system-which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of methods in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) as measured by the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
    Source
     Information processing and management. 43(2007) no.6, pp. 1588-1605
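  The retrieve-cluster-summarize pipeline above can be sketched with standard components. A toy version, assuming scikit-learn is available: TruncatedSVD stands in for the LSI step, and k-means on length-normalized vectors approximates spherical k-means; the HMM/pivoted-QR summarizer is omitted, and the documents are invented:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.preprocessing import normalize
      from sklearn.cluster import KMeans

      docs = ["stock markets fell sharply as inflation data surprised traders",
              "the central bank raised interest rates to curb inflation",
              "the home team won the championship after a late goal",
              "fans celebrated the championship victory across the city"]

      X = TfidfVectorizer(stop_words="english").fit_transform(docs)
      lsi = normalize(TruncatedSVD(n_components=2).fit_transform(X))  # LSI + unit norm
      clusters = KMeans(n_clusters=2, n_init=10).fit_predict(lsi)     # ~spherical k-means
      print(clusters)  # two topic clusters; QCS would then summarize each one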
  17. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.00
    0.0034897653 = product of:
      0.013959061 = sum of:
        0.013959061 = weight(_text_:information in 657) [ClassicSimilarity], result of:
          0.013959061 = score(doc=657,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.16796975 = fieldWeight in 657, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=657)
      0.25 = coord(1/4)
    
    Abstract
     Purpose - The purpose of this research is to develop a method for automatic construction of multi-document summaries of sets of news articles that might be retrieved by a web search engine in response to a user query.
     Design/methodology/approach - Based on the cross-document discourse analysis, an event-based framework is proposed for integrating and organizing information extracted from different news articles. It has a hierarchical structure in which the summarized information is presented at the top level and more detailed information given at the lower levels. A tree-view interface was implemented for displaying a multi-document summary based on the framework. A preliminary user evaluation was performed by comparing the framework-based summaries against the sentence-based summaries.
     Findings - In a small evaluation, all the human subjects preferred the framework-based summaries to the sentence-based summaries. It indicates that the event-based framework is an effective way to summarize a set of news articles reporting an event or a series of relevant events.
     Research limitations/implications - Limited to event-based news articles only, not applicable to news critiques and other kinds of news articles. A summarization system based on the event-based framework is being implemented.
     Practical implications - Multi-document summarization of news articles can adopt the proposed event-based framework.
     Originality/value - An event-based framework for summarizing sets of news articles was developed and evaluated using a tree-view interface for displaying such summaries.
  18. Yulianti, E.; Huspi, S.; Sanderson, M.: Tweet-biased summarization (2016) 0.00
    0.0034897653 = product of:
      0.013959061 = sum of:
        0.013959061 = weight(_text_:information in 2926) [ClassicSimilarity], result of:
          0.013959061 = score(doc=2926,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.16796975 = fieldWeight in 2926, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2926)
      0.25 = coord(1/4)
    
    Abstract
     We examined whether the microblog comments given by people after reading a web document could be exploited to improve the accuracy of a web document summarization system. We examined the effect of social information (i.e., tweets) on the accuracy of the generated summaries by comparing the user preference for TBS (tweet-biased summary) with GS (generic summary). The result of a crowdsourcing-based evaluation shows that the user preference for TBS was significantly higher than for GS. We also took random samples of the documents to see the performance of summaries in a traditional evaluation using ROUGE, which in general also showed TBS to be better than GS. We further analyzed the influence of the number of tweets pointing to a web document on summarization accuracy, finding a positive moderate correlation between the number of tweets pointing to a web document and the performance of the generated TBS as measured by user preference. The results show that incorporating social information into the summary generation process can improve the accuracy of the summary. The reasons people choose one summary over another in a crowdsourcing-based evaluation are also presented in this article.
    Source
     Journal of the Association for Information Science and Technology. 67(2016) no.6, pp. 1289-1300
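  A hedged sketch of the tweet-biasing step described above, assuming scikit-learn; scoring each document sentence by its best-matching tweet is an illustrative choice, not necessarily the paper's exact model:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      def tweet_biased_summary(sentences, tweets, k=2):
          # Score each document sentence by its best cosine match among the tweets.
          vec = TfidfVectorizer().fit(sentences + tweets)
          sims = cosine_similarity(vec.transform(sentences),
                                   vec.transform(tweets)).max(axis=1)
          top = sorted(range(len(sentences)), key=lambda i: sims[i], reverse=True)[:k]
          return [sentences[i] for i in sorted(top)]  # keep original order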
  19. Atanassova, I.; Bertin, M.; Larivière, V.: On the composition of scientific abstracts (2016) 0.00
    0.0034897653 = product of:
      0.013959061 = sum of:
        0.013959061 = weight(_text_:information in 3028) [ClassicSimilarity], result of:
          0.013959061 = score(doc=3028,freq=6.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.16796975 = fieldWeight in 3028, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3028)
      0.25 = coord(1/4)
    
    Abstract
     Purpose - Scientific abstracts reproduce only part of the information and the complexity of argumentation in a scientific article. This paper provides a first analysis of the similarity between the text of scientific abstracts and the body of articles, using sentences as the basic textual unit. It contributes to the understanding of the structure of abstracts.
     Design/methodology/approach - Using sentence-based similarity metrics, the authors quantify the phenomenon of text re-use in abstracts and examine the positions of the sentences that are similar to sentences in abstracts in the introduction, methods, results and discussion structure, using a corpus of over 85,000 research articles published in the seven Public Library of Science journals.
     Findings - The authors provide evidence that 84 percent of abstracts have at least one sentence in common with the body of the paper. Studying the distributions of sentences in the body of the articles that are re-used in abstracts, the authors show that there exists a strong relation between the rhetorical structure of articles and the zones that authors re-use when writing abstracts, with sentences mainly coming from the beginning of the introduction and the end of the conclusion.
     Originality/value - Scientific abstracts contain what is considered by the author(s) as the information that best describes the documents' content. This is a first study that examines the relation between the contents of abstracts and the rhetorical structure of scientific articles. The work might provide new insight for improving automatic abstracting tools as well as information retrieval approaches, in which text organization and structure are important features.
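  The sentence-based similarity measurement described above can be sketched simply; word-overlap Jaccard is an assumed stand-in for the paper's metrics, and the function name is hypothetical:

      def reuse_scores(abstract_sents, body_sents):
          # For each abstract sentence, find its most similar body sentence; a
          # high score suggests the abstract sentence was re-used from the body.
          def jaccard(a, b):
              wa, wb = set(a.lower().split()), set(b.lower().split())
              return len(wa & wb) / len(wa | wb) if wa | wb else 0.0
          return [max((jaccard(a, b), b) for b in body_sents)
                  for a in abstract_sents]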
  20. Craven, T.C.: Abstracts produced using computer assistance (2000) 0.00
    0.0034192575 = product of:
      0.01367703 = sum of:
        0.01367703 = weight(_text_:information in 4809) [ClassicSimilarity], result of:
          0.01367703 = score(doc=4809,freq=4.0), product of:
            0.08310462 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047340166 = queryNorm
            0.16457605 = fieldWeight in 4809, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4809)
      0.25 = coord(1/4)
    
    Abstract
    Experimental subjects wrote abstracts using a simplified version of the TEXNET abstracting assistance software. In addition to the full text, subjects were presented with either keywords or phrases extracted automatically. The resulting abstracts, and the times taken, were recorded automatically; some additional information was gathered by oral questionnaire. Selected abstracts produced were evaluated on various criteria by independent raters. Results showed considerable variation among subjects, but 37% found the keywords or phrases 'quite' or 'very' useful in writing their abstracts. Statistical analysis failed to support several hypothesized relations: phrases were not viewed as significantly more helpful than keywords; and abstracting experience did not correlate with originality of wording, approximation of the author abstract, or greater conciseness. Requiring further study are some unanticipated strong correlations including the following: Windows experience and writing an abstract like the author's; experience reading abstracts and thinking one had written a good abstract; gender and abstract length; gender and use of words and phrases from the original text. Results have also suggested possible modifications to the TEXNET software
    Source
     Journal of the American Society for Information Science. 51(2000) no.8, pp. 745-756

Types

  • a 84
  • el 1
  • m 1
  • s 1