Search (54 results, page 1 of 3)

  • Active filter: theme_ss:"Automatisches Abstracting"
  1. Liang, S.-F.; Devlin, S.; Tait, J.: Investigating sentence weighting components for automatic summarisation (2007) 0.10
    0.10186547 = product of:
      0.20373094 = sum of:
        0.09581695 = weight(_text_:term in 899) [ClassicSimilarity], result of:
          0.09581695 = score(doc=899,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.4374403 = fieldWeight in 899, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=899)
        0.107914 = weight(_text_:frequency in 899) [ClassicSimilarity], result of:
          0.107914 = score(doc=899,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.39037234 = fieldWeight in 899, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.046875 = fieldNorm(doc=899)
      0.5 = coord(2/4)
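    The same ClassicSimilarity (tf-idf) formula underlies every score in this list, so later entries show only an abridged breakdown. As a minimal Python sketch (not Lucene's own code), the tree above can be reproduced from the quantities it displays:

      import math

      def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
          # One weight(_text_:...) clause: queryWeight * fieldWeight, where
          # queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm
          idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # inverse document frequency
          tf = math.sqrt(freq)                             # term-frequency factor
          return (idf * query_norm) * (tf * idf * field_norm)

      w_term = term_weight(4, 1130, 44218, 0.04694356, 0.046875)  # ~0.09581695
      w_freq = term_weight(2, 332, 44218, 0.04694356, 0.046875)   # ~0.107914
      print(0.5 * (w_term + w_freq))  # coord(2/4) x sum ~ 0.10186547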
    
    Abstract
    The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order (QTO) algorithm, which subsequently proved to be a reliable indicator for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. The summaries produced were evaluated by the ROUGE-1 metric, and the results showed that using QTO in a weighting combination gave the best performance. We also found that a combination of several weighting components always outperformed any single weighting component.
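    The abstract does not give the six weighting schemes, so the following is an illustrative sketch only: the component definitions and weights below are our assumptions, not the authors' published QTO formulation.

      def sentence_score(sentence, query_terms, w_qtf=0.5, w_qto=0.5):
          # Toy two-component sentence weight (illustrative, not the published QTO)
          tokens = sentence.lower().split()
          # Query Term Frequency: how often query terms occur, per token
          qtf = sum(tokens.count(t) for t in query_terms) / max(len(tokens), 1)
          # Query Term Order: earlier query terms count for more when present
          qto = sum(1.0 / (i + 1) for i, t in enumerate(query_terms) if t in tokens)
          return w_qtf * qtf + w_qto * qto

      sents = ["Sentence weighting drives the summariser.", "Unrelated filler text."]
      print(max(sents, key=lambda s: sentence_score(s, ["sentence", "weighting"])))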
  2. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.05
    0.050061207 = coord(2/4) × [0.010194084 (based) + 0.08992833 (frequency)] (ClassicSimilarity breakdown abridged here and below; inner coord factors are folded into the clause values, full expansion as in no. 1)
    
    Abstract
    Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods to learning and graph-based algorithms. The advent of human-generated knowledge bases like Wikipedia offers a further possibility in text summarization - they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization - users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that incremental summarization can help in better understanding news articles.
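    As a rough sketch of the bipartite idea only (the paper's update rules, Wikipedia concept mapping, and convergence analysis are richer than this), sentence and concept scores can reinforce each other iteratively:

      import numpy as np

      # A[i, j] = 1 if sentence i mentions concept j (placeholder incidence matrix;
      # in the paper, concepts are salient Wikipedia entries).
      A = np.array([[1, 1, 0],
                    [0, 1, 1],
                    [1, 0, 0]], dtype=float)

      sent = np.full(A.shape[0], 1.0 / A.shape[0])  # uniform initial sentence ranks
      for _ in range(50):                           # iterate toward convergence
          concept = A.T @ sent                      # concept scores from sentences
          concept /= concept.sum()
          sent = A @ concept                        # sentence ranks from concepts
          sent /= sent.sum()
      print(np.argsort(-sent))                      # sentence indices, best first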
  3. Goh, A.; Hui, S.C.; Chan, S.K.: ¬A text extraction system for news reports (1996) 0.05
    0.047906943 = coord(2/4) × [0.005885557 (based) + 0.08992833 (frequency)] (abridged)
    
    Abstract
    Describes the design and implementation of a text extraction tool, NEWS_EXT, which automatically produces summaries from news reports by extracting sentences to form indicative abstracts. Selection of sentences is based on sentence importance, measured by means of sentence scoring or simple linguistic analysis of sentence structure. Tests were conducted on 4 approaches for the functioning of the NEWS_EXT system: extraction by keyword frequency; extraction by title keywords; extraction by location; and extraction by indicative phrase. Reports a study comparing the output of NEWS_EXT with manually produced extracts, using relevance as the criterion for effectiveness. 48 newspaper articles were assessed (The Straits Times, International Herald Tribune, Asian Wall Street Journal, and Financial Times). The evaluation was conducted in 2 stages: stage 1 involving abstracts produced manually by 2 human experts; stage 2 involving the generation of abstracts using NEWS_EXT. Results of each of the 4 approaches were compared with the human-produced abstracts, where the title and location approaches were found to give the best results for both local and foreign news. Reports plans to refine and enhance NEWS_EXT and incorporate it as a module within a larger newspaper clipping system
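    A toy rendering of three of the four cues follows; the function name, weights, and data are ours, since the original system's scoring is not given in this abstract:

      def news_score(sentence, position, n_sentences, title_words, keyword_freq):
          words = set(sentence.lower().split())
          kw = sum(keyword_freq.get(w, 0) for w in words)               # keyword frequency
          title = len(words & title_words) / max(len(title_words), 1)  # title keywords
          location = 1.0 - position / max(n_sentences - 1, 1)          # lead sentences win
          return kw + title + location

      title_words = {"bank", "merger"}
      freqs = {"bank": 0.9, "merger": 0.7}
      s = "The bank confirmed the merger on Monday."
      print(news_score(s, position=0, n_sentences=10,
                       title_words=title_words, keyword_freq=freqs))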
  4. Reeve, L.H.; Han, H.; Brooks, A.D.: ¬The use of domain-specific concepts in biomedical text summarization (2007) 0.05
    0.047906943 = coord(2/4) × [0.005885557 (based) + 0.08992833 (frequency)] (abridged)
    
    Abstract
    Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full text.
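    A loose sketch of the frequency-distribution idea (the published FreqDist works over domain-specific concepts, not raw words as here): greedily prefer sentences that add not-yet-covered frequency mass, which suppresses redundancy.

      from collections import Counter

      def freqdist_select(sentences, k=2):
          freq = Counter(w for s in sentences for w in s.lower().split())
          covered, chosen = set(), []
          for _ in range(min(k, len(sentences))):
              # gain = frequency mass of words this sentence would newly cover
              best = max((s for s in sentences if s not in chosen),
                         key=lambda s: sum(freq[w]
                                           for w in set(s.lower().split()) - covered))
              chosen.append(best)
              covered |= set(best.lower().split())
          return chosen

      print(freqdist_select(["Cells divide rapidly.", "Cells divide.", "Birds fly."]))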
  5. Abdi, A.; Idris, N.; Alguliev, R.M.; Aliguliyev, R.M.: Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems (2015) 0.04
    0.04460577 = coord(2/4) × [0.0070626684 (based) + 0.08214887 (assessment)] (abridged)
    
    Abstract
    Summary writing is a process for creating a short version of a source text. It can be used as a measure of understanding. As grading students' summaries is a very time-consuming task, computer-assisted assessment can help teachers perform the grading more effectively. Several techniques, such as BLEU, ROUGE, N-gram co-occurrence, Latent Semantic Analysis (LSA), LSA_Ngram and LSA_ERB, have been proposed to support the automatic assessment of students' summaries. Since these techniques are more suitable for long texts, their performance is not satisfactory for the evaluation of short summaries. This paper proposes a specialized method that works well in assessing short summaries. Our proposed method integrates the semantic relations between words and their syntactic composition. As a result, the proposed method achieves high accuracy and improved performance compared with the current techniques. Experiments show that it is preferable to the existing techniques. A summary evaluation system based on the proposed method has also been developed.
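    A minimal sketch of the combination idea, assuming token overlap for the semantic part and word-order agreement for the syntactic part; the method's actual relatedness and composition measures are far richer:

      def grade(student, reference, alpha=0.5):
          s, r = student.lower().split(), reference.lower().split()
          semantic = len(set(s) & set(r)) / max(len(set(s) | set(r)), 1)  # Jaccard
          shared = [w for w in s if w in r]              # shared words, student order
          ordered = sum(a == b for a, b in zip(shared, sorted(shared, key=r.index)))
          syntactic = ordered / max(len(shared), 1)      # word-order agreement
          return alpha * semantic + (1 - alpha) * syntactic

      print(grade("the cat sat", "the cat sat on the mat"))  # 0.8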
  6. Díaz, A.; Gervás, P.: User-model based personalized summarization (2007) 0.04
    0.037407737 = coord(2/4) × [0.0070626684 (based) + 0.06775281 (term)] (abridged)
    
    Abstract
    The potential of summary personalization is high: a summary that would be useless for deciding the relevance of a document if produced generically may be useful if the selected sentences match the user's interests. In this paper we defend the use of a personalized summarization facility to maximize the density of relevance of selections sent by a personalized information system to a given user. The personalization is applied to the digital newspaper domain and uses a user model that stores long- and short-term interests via four reference systems: sections, categories, keywords and feedback terms. At the same time, it is crucial to measure how much information is lost during the summarization process, and how this loss may affect the user's ability to judge the relevance of a given document. The results obtained in two personalization systems show that personalized summaries perform better than generic and generic-personalized summaries in terms of identifying documents that satisfy user preferences. We also conducted a user-centred direct evaluation that showed a high level of user satisfaction with the summaries.
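    The notion of "density of relevance" invites a small sketch; here the user model is collapsed into one weighted term dict as a simplifying assumption, whereas the paper keeps four separate reference systems:

      def relevance_density(sentence, user_terms):
          words = [w.strip('.,;:') for w in sentence.lower().split()]
          # relevance mass per word: high when many words match user interests
          return sum(user_terms.get(w, 0.0) for w in words) / max(len(words), 1)

      user_terms = {"economy": 1.0, "inflation": 0.8}   # placeholder user model
      news = ["Inflation slowed the economy again.", "The match ended late."]
      print(max(news, key=lambda s: relevance_density(s, user_terms)))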
  7. Pinto, M.: Abstracting/abstract adaptation to digital environments : research trends (2003) 0.02
    0.023954237 = coord(1/4) × 0.09581695 (term) (abridged)
    
    Abstract
    The technological revolution is affecting the structure, form and content of documents, reducing the effectiveness of traditional abstracts, which are to some extent inadequate to the new documentary conditions. Aims to show the directions in which abstracting/abstracts can evolve to achieve the necessary adequacy in the new digital environments. Three research trends are proposed: theoretical, methodological and pragmatic. Theoretically, there is a need to expand the document concept, reengineer abstracting and design interdisciplinary models. Methodologically, the trend is toward the structuring, automating and qualifying of abstracts. Pragmatically, abstract networking, combined with alternative and complementary models, opens a new and promising horizon. Automating, structuring and qualifying abstracting/abstracts offer short-term prospects for progress. Concludes that reengineering, networking and visualising would be fruitful medium-term areas of research toward the full adequacy of abstracting in the new electronic age.
  8. Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.02
    0.020875674 = coord(2/4) × [0.016310534 (based) + 0.025440816 ("22")] (abridged)
    
    Abstract
    Presents a system for summarizing quantitative data in natural language, focusing on the use of a corpus of basketball game summaries, drawn from online news services, to empirically shape the system design and to evaluate the approach. Initial corpus analysis revealed characteristics of textual summaries that challenge the capabilities of current language generation systems. A revision-based corpus analysis was used to identify and encode the revision rules of the system. Presents a quantitative evaluation, using several test corpora, to measure the robustness of the new revision-based model
    Date
    6. 3.1997 16:22:15
  9. Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01
    0.013835812 = coord(2/4) × [0.011771114 (based) + 0.015900511 ("22")] (abridged)
    
    Abstract
    We propose a tag-based framework that simulates human abstractors' ability to select significant sentences based on key concepts in a sentence as well as the semantic relations between key concepts to create generic summaries of transcribed lecture videos. The proposed extractive summarization method uses tags (viewer- and author-assigned terms) as key concepts. Our method employs Flickr tag clusters and WordNet synonyms to expand tags and detect the semantic relations between tags. This method helps select sentences that have a greater number of semantically related key concepts. To investigate the effectiveness and uniqueness of the proposed method, we compare it with an existing technique, latent semantic analysis (LSA), using intrinsic and extrinsic evaluations. The results of intrinsic evaluation show that the tag-based method is as or more effective than the LSA method. We also observe that in the extrinsic evaluation, the grand mean accuracy score of the tag-based method is higher than that of the LSA method, with a statistically significant difference. Elaborating on our results, we discuss the theoretical and practical implications of our findings for speech video summarization and retrieval.
    Date
    22. 1.2016 12:29:41
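    A compact sketch of the tag-expansion step from no. 9; the expansion sets below are placeholders, whereas the method derives them from Flickr tag clusters and WordNet synonyms:

      def concept_hits(sentence, tags, expansions):
          # count tags whose expanded concept set appears in the sentence
          words = {w.strip('.,') for w in sentence.lower().split()}
          return sum(1 for t in tags if words & ({t} | expansions.get(t, set())))

      expansions = {"neuron": {"nerve", "synapse"}}   # placeholder expansion
      print(concept_hits("Each synapse links two nerve cells.", ["neuron"],
                         expansions))  # 1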
  10. Haag, M.: Automatic text summarization : Evaluation des Copernic Summarizer und mögliche Einsatzfelder in der Fachinformation der DaimlerChrysler AG [evaluation of the Copernic Summarizer and possible fields of application in the information services of DaimlerChrysler AG] (2002) 0.01
    0.011857167 = coord(1/4) × 0.047428668 (assessment) (abridged)
    
    Abstract
    An evaluation of the Copernic Summarizer, a software product for automatically summarizing text in various data formats, is presented. The aim is to assess whether and how the Copernic Summarizer can reasonably be used in the DaimlerChrysler Information Division to enhance the quality of its information services. First, an introduction to automatic text summarization is given and the Copernic Summarizer is presented. Various methods for evaluating automatic text summarization systems and software ergonomics are then described. Two evaluation forms are developed with which the employees of the Information Division are to rate the quality and relevance of the extracted keywords and summaries as well as the software's usability. The quality and relevance assessment is done by comparing the original text to the summaries. Finally, a recommendation is given concerning the use of the Copernic Summarizer.
  11. Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01
    0.010893034 = coord(2/4) × [0.005885557 (based) + 0.015900511 ("22")] (abridged)
    
    Abstract
    With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases contribute increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp a paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and we verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the macro-averages of three evaluation metrics on the Papers with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
    Date
    22. 6.2023 14:55:20
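    A hedged sketch of the control-code conditioning used in no. 11, via Hugging Face Transformers. The checkpoint and control token here are placeholders; without the paper's fine-tuning, a base model will not emit sensible keyphrases.

      from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

      tok = AutoTokenizer.from_pretrained("facebook/bart-base")   # placeholder model
      model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

      def keyphrases(abstract, control_code="<method>"):
          # Prepend the keyphrase-function control code to steer generation
          inputs = tok(control_code + " " + abstract,
                       return_tensors="pt", truncation=True)
          out = model.generate(**inputs, num_beams=4, max_new_tokens=32)
          return tok.decode(out[0], skip_special_tokens=True)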
  12. Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.01
    0.006360204 = coord(1/4) × 0.025440816 ("22") (abridged)
    
    Date
    26. 2.1997 10:22:43
  13. Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.01
    0.006360204 = coord(1/4) × 0.025440816 ("22") (abridged)
    
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  14. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.01
    0.0055054342 = coord(1/4) × 0.022021737 (based) (abridged)
    
    Abstract
    Purpose - The purpose of this research is to develop a method for automatic construction of multi-document summaries of sets of news articles that might be retrieved by a web search engine in response to a user query.
    Design/methodology/approach - Based on cross-document discourse analysis, an event-based framework is proposed for integrating and organizing information extracted from different news articles. It has a hierarchical structure in which the summarized information is presented at the top level and more detailed information is given at the lower levels. A tree-view interface was implemented for displaying a multi-document summary based on the framework. A preliminary user evaluation was performed by comparing the framework-based summaries against sentence-based summaries.
    Findings - In a small evaluation, all the human subjects preferred the framework-based summaries to the sentence-based summaries. This indicates that the event-based framework is an effective way to summarize a set of news articles reporting an event or a series of relevant events.
    Research limitations/implications - Limited to event-based news articles only; not applicable to news critiques and other kinds of news articles. A summarization system based on the event-based framework is being implemented.
    Practical implications - Multi-document summarization of news articles can adopt the proposed event-based framework.
    Originality/value - An event-based framework for summarizing sets of news articles was developed and evaluated using a tree-view interface for displaying such summaries.
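    One way to picture the hierarchical framework as a data structure; the field names are illustrative, not the authors' schema:

      from dataclasses import dataclass, field

      @dataclass
      class EventNode:
          text: str                                     # summarized statement
          sources: list = field(default_factory=list)   # contributing articles
          children: list = field(default_factory=list)  # finer-grained sub-events

      root = EventNode("Storm reaches the coast", sources=["article-1", "article-2"],
                       children=[EventNode("Evacuations ordered",
                                           sources=["article-1"])])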
  15. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.00
    0.0047701527 = coord(1/4) × 0.019080611 ("22") (abridged)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
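    For orientation, the SumBasic core that the system builds on can be sketched in a few lines; topic focusing, sentence simplification, and lexical expansion are the paper's additions and are omitted here:

      from collections import Counter

      def sumbasic(sentences, n=2):
          words = [s.lower().split() for s in sentences]
          counts = Counter(w for ws in words for w in ws)
          total = sum(counts.values())
          p = {w: c / total for w, c in counts.items()}     # word probabilities
          chosen = []
          while len(chosen) < min(n, len(sentences)):
              # pick the sentence with the highest average word probability
              best = max((s for s in sentences if s not in chosen),
                         key=lambda s: sum(p[w] for w in s.lower().split())
                                       / len(s.split()))
              chosen.append(best)
              for w in best.lower().split():
                  p[w] **= 2                                # damp re-used words
          return chosen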
  16. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.00
    0.003975128 = coord(1/4) × 0.015900511 ("22") (abridged)
    
    Date
    22. 7.2006 17:25:48
  17. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.00
    0.003975128 = coord(1/4) × 0.015900511 ("22") (abridged)
    
    Date
    22. 1.2023 18:57:12
  18. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Automatic multidocument summarization of research abstracts : design and user evaluation (2007) 0.00
    0.00389293 = coord(1/4) × 0.01557172 (based) (abridged)
    
    Abstract
    The purpose of this study was to develop a method for automatic construction of multidocument summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts and relationships as well as research methods and contextual relations extracted from different dissertation abstracts. Based on the framework, a new summarization method was developed, which parses the discourse structure of abstracts, extracts research concepts and relationships, integrates the information across different abstracts, and organizes and presents them in a Web-based interface. The focus of this article is on the user evaluation that was performed to assess the overall quality and usefulness of the summaries. Two types of variable-based summaries generated using the summarization method - with or without the use of a taxonomy - were compared against a sentence-based summary that lists only the research-objective sentences extracted from each abstract and another sentence-based summary generated using the MEAD system that extracts important sentences. The evaluation results indicate that the majority of sociological researchers (70%) and general users (64%) preferred the variable-based summaries generated with the use of the taxonomy.
  19. Xiong, S.; Ji, D.: Query-focused multi-document summarization using hypergraph-based ranking (2016) 0.00
    0.0035679291 = coord(1/4) × 0.014271717 (based) (abridged)
    
    Abstract
    General graph random walks have been successfully applied in multi-document summarization, but they have limitations when processing documents in this way. In this paper, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first exploits the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences. Then a hypergraph is used to capture both cluster relationships, based on the word-topic probability distribution, and pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs is developed to rank sentences, ensuring sentence diversity in summaries through vertex reinforcement. Experimental results on a publicly available dataset demonstrate the effectiveness of our framework.
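    A plain (not vertex-reinforced) hypergraph random-walk sketch for intuition only; the published framework adds HDP topics, edge weights, and time-variant reinforcement:

      import numpy as np

      H = np.array([[1, 0],        # H[i, e] = 1 if sentence i lies in hyperedge e
                    [1, 1],        # (hyperedges stand in for topic clusters)
                    [0, 1]], dtype=float)

      P = H @ H.T                  # sentence-to-sentence moves via shared hyperedges
      np.fill_diagonal(P, 0.0)
      P = P / P.sum(axis=1, keepdims=True)

      rank = np.full(3, 1.0 / 3)
      for _ in range(100):         # damped power iteration, PageRank-style
          rank = 0.85 * (P.T @ rank) + 0.15 / 3
      print(np.argsort(-rank))     # the middle sentence ranks highest here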
  20. Hobson, S.P.; Dorr, B.J.; Monz, C.; Schwartz, R.: Task-based evaluation of text summarization using Relevance Prediction (2007) 0.00
    0.0035313342 = coord(1/4) × 0.014125337 (based) (abridged)
    
    Abstract
    This article introduces a new task-based evaluation measure called Relevance Prediction that is a more intuitive measure of an individual's performance on a real-world task than interannotator agreement. Relevance Prediction parallels what a user does in the real-world task of browsing a set of documents using standard search tools: the user judges relevance based on a short summary, and then that same user - not an independent user - decides whether to open (and judge) the corresponding document. This measure is shown to be a more reliable measure of task performance than LDC Agreement, a current gold-standard-based measure used in the summarization evaluation community. Our goal is to provide a stable framework within which developers of new automatic measures may make stronger statistical statements about the effectiveness of their measures in predicting summary usefulness. We demonstrate - as a proof-of-concept methodology for automatic metric developers - that a current automatic evaluation measure has a better correlation with Relevance Prediction than with LDC Agreement, and that the significance level for detected differences is higher for the former than for the latter.
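    Under one reading of the description, the measure reduces to a within-user agreement rate; a small sketch under that assumption:

      def relevance_prediction(judgments):
          # share of documents where a user's summary-based relevance call
          # matches the same user's full-document call
          agree = sum(s == d for s, d in judgments)
          return agree / len(judgments)

      # (summary judgment, document judgment) pairs for one user -- toy data
      print(relevance_prediction([(True, True), (True, False),
                                  (False, False), (True, True)]))  # 0.75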

Languages

  • e (English) 52
  • chi (Chinese) 1
  • d (German) 1

Types

  • a (article) 53
  • el (electronic resource) 1
  • m (monograph) 1