Search (22 results, page 1 of 2)

Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.06

0.05908383 = product of:
  0.11816766 = sum of:
    0.11816766 = sum of:
      0.061605897 = weight(_text_:systems in 6974) [ClassicSimilarity], result of:
        0.061605897 = score(doc=6974,freq=4.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.38414678 = fieldWeight in 6974, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0625 = fieldNorm(doc=6974)
      0.056561764 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
        0.056561764 = score(doc=6974,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.30952093 = fieldWeight in 6974, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=6974)
  0.5 = coord(1/2)
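
The explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity TF-IDF form: tf = sqrt(freq), and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the variant that reproduces the idf values shown here. The function names `classic_idf` and `term_score` are illustrative and are not part of any Lucene API. The constants are copied from the tree for doc 6974.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; chosen because it reproduces the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # queryWeight = idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 6974.
MAX_DOCS = 44218
QUERY_NORM = 0.052184064
FIELD_NORM = 0.0625

systems = term_score(4.0, 5561, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(1/2): only one of two top-level query clauses matched.
total = 0.5 * (systems + twenty_two)
print(f"systems={systems:.6f} 22={twenty_two:.6f} total={total:.6f}")
```

The results agree with the listing to about six decimal places. The explain output appears to use single-precision floats, so the last digits can differ slightly from this double-precision calculation.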

Abstract: Describes the application of weighting strategies to model uncertainties and probabilities in automatic abstracting systems, particularly in the concept selection phase. The weights were originally assigned in an ad hoc manner and were then refined by manual analysis of the results. The new method attempts to derive the weights more systematically, using a genetic algorithm
Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon

Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.05

0.050061855 = product of:
  0.10012371 = sum of:
    0.10012371 = sum of:
      0.043561947 = weight(_text_:systems in 6751) [ClassicSimilarity], result of:
        0.043561947 = score(doc=6751,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.2716328 = fieldWeight in 6751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0625 = fieldNorm(doc=6751)
      0.056561764 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
        0.056561764 = score(doc=6751,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.30952093 = fieldWeight in 6751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=6751)
  0.5 = coord(1/2)

Abstract: Presents a system for summarizing quantitative data in natural language, focusing on the use of a corpus of basketball game summaries, drawn from online news services, to empirically shape the system design and to evaluate the approach. Initial corpus analysis revealed characteristics of textual summaries that challenge the capabilities of current language generation systems. A revision-based corpus analysis was used to identify and encode the revision rules of the system. Presents a quantitative evaluation, using several test corpora, to measure the robustness of the new revision-based model
Date: 6. 3.1997 16:22:15

Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.04
```
0.044312872 = product of:
  0.088625744 = sum of:
    0.088625744 = sum of:
      0.04620442 = weight(_text_:systems in 948) [ClassicSimilarity], result of:
        0.04620442 = score(doc=948,freq=4.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.28811008 = fieldWeight in 948, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.046875 = fieldNorm(doc=948)
      0.042421322 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
        0.042421322 = score(doc=948,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.23214069 = fieldWeight in 948, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=948)
  0.5 = coord(1/2)
```
Abstract

In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.03
```
0.03128866 = product of:
  0.06257732 = sum of:
    0.06257732 = sum of:
      0.027226217 = weight(_text_:systems in 5290) [ClassicSimilarity], result of:
        0.027226217 = score(doc=5290,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.1697705 = fieldWeight in 5290, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
      0.0353511 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
        0.0353511 = score(doc=5290,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.19345059 = fieldWeight in 5290, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
  0.5 = coord(1/2)
```
Abstract

Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: The more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding new identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than the systems we compared to and that the learning function is effective.

Date

22. 7.2006 17:25:48
Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.01
```
0.014147157 = product of:
  0.028294314 = sum of:
    0.028294314 = product of:
      0.056588627 = sum of:
        0.056588627 = weight(_text_:systems in 932) [ClassicSimilarity], result of:
          0.056588627 = score(doc=932,freq=6.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.35286134 = fieldWeight in 932, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.

Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.01

0.014140441 = product of:
  0.028280882 = sum of:
    0.028280882 = product of:
      0.056561764 = sum of:
        0.056561764 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
          0.056561764 = score(doc=6599,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.30952093 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 26. 2.1997 10:22:43

McKeown, K.; Robin, J.; Kukich, K.: Generating concise natural language summaries (1995) 0.01

0.013613109 = product of:
  0.027226217 = sum of:
    0.027226217 = product of:
      0.054452434 = sum of:
        0.054452434 = weight(_text_:systems in 2932) [ClassicSimilarity], result of:
          0.054452434 = score(doc=2932,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.339541 = fieldWeight in 2932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.078125 = fieldNorm(doc=2932)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Description of the problems for summary generation, the applications developed (STREAK for basketball games and PLANDOC for telephone network planning activity), the linguistic constructions that the systems use to convey information concisely and the textual constraints that determine what information gets included

Díaz, A.; Gervás, P.: User-model based personalized summarization (2007) 0.01
```
0.011551105 = product of:
  0.02310221 = sum of:
    0.02310221 = product of:
      0.04620442 = sum of:
        0.04620442 = weight(_text_:systems in 952) [ClassicSimilarity], result of:
          0.04620442 = score(doc=952,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.28811008 = fieldWeight in 952, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=952)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The potential of summary personalization is high, because a summary that would be useless for deciding the relevance of a document if produced in a generic manner may be useful if the selected sentences match the user's interests. In this paper we defend the use of a personalized summarization facility to maximize the density of relevance of selections sent by a personalized information system to a given user. The personalization is applied to the digital newspaper domain and uses a user model that stores long- and short-term interests using four reference systems: sections, categories, keywords and feedback terms. On the other hand, it is crucial to measure how much information is lost during the summarization process, and how this information loss may affect the ability of the user to judge the relevance of a given document. The results obtained in two personalization systems show that personalized summaries perform better than generic and generic-personalized summaries in terms of identifying documents that satisfy user preferences. We also considered a user-centred direct evaluation that showed a high level of user satisfaction with the summaries.
Brandow, R.; Mitze, K.; Rau, L.F.: Automatic condensation of electronic publications by sentence selection (1995) 0.01
```
0.010890487 = product of:
  0.021780973 = sum of:
    0.021780973 = product of:
      0.043561947 = sum of:
        0.043561947 = weight(_text_:systems in 2929) [ClassicSimilarity], result of:
          0.043561947 = score(doc=2929,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2716328 = fieldWeight in 2929, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=2929)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Description of a system that performs domain-independent automatic condensation of news from a large commercial news service encompassing 41 different publications. This system was evaluated against a system that condensed the same articles using only the first portions of the texts (the lead), up to the target length of the summaries. Three lengths of articles were evaluated for 250 documents by both systems, totalling 1,500 suitability judgements in all. The lead-based summaries outperformed the 'intelligent' summaries significantly, achieving acceptability ratings of over 90%, compared to 74.7%

Over, P.; Dang, H.; Harman, D.: DUC in context (2007) 0.01

0.010890487 = product of:
  0.021780973 = sum of:
    0.021780973 = product of:
      0.043561947 = sum of:
        0.043561947 = weight(_text_:systems in 934) [ClassicSimilarity], result of:
          0.043561947 = score(doc=934,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2716328 = fieldWeight in 934, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=934)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems. Many factors can affect the design of such evaluations, requiring choices among competing alternatives. This paper examines several major themes running through three evaluations: SUMMAC, NTCIR, and DUC, with a concentration on DUC. The themes are extrinsic and intrinsic evaluation, evaluation procedures and methods, generic versus focused summaries, single- and multi-document summaries, length and compression issues, extracts versus abstracts, and issues with genre.

Jones, S.; Paynter, G.W.: Automatic extraction of document keyphrases for use in digital libraries : evaluations and applications (2002) 0.01
```
0.009625921 = product of:
  0.019251842 = sum of:
    0.019251842 = product of:
      0.038503684 = sum of:
        0.038503684 = weight(_text_:systems in 601) [ClassicSimilarity], result of:
          0.038503684 = score(doc=601,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.24009174 = fieldWeight in 601, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.
Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
```
0.009625921 = product of:
  0.019251842 = sum of:
    0.019251842 = product of:
      0.038503684 = sum of:
        0.038503684 = weight(_text_:systems in 5400) [ClassicSimilarity], result of:
          0.038503684 = score(doc=5400,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.24009174 = fieldWeight in 5400, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.

Footnote

Contribution to a special issue: Research Information Systems and Science Classifications; including papers from "Trajectories for Research: Fathoming the Promise of the NARCIS Classification," 27-28 September 2018, The Hague, The Netherlands.
Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.01
```
0.0094314385 = product of:
  0.018862877 = sum of:
    0.018862877 = product of:
      0.037725754 = sum of:
        0.037725754 = weight(_text_:systems in 947) [ClassicSimilarity], result of:
          0.037725754 = score(doc=947,freq=6.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2352409 = fieldWeight in 947, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=947)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel integrated information retrieval system - the Query, Cluster, Summarize (QCS) system - which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of methods in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) as measured by the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format.
Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.

Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01

0.008837775 = product of:
  0.01767555 = sum of:
    0.01767555 = product of:
      0.0353511 = sum of:
        0.0353511 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
          0.0353511 = score(doc=2640,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.19345059 = fieldWeight in 2640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2640)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 1.2016 12:29:41

Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.01

0.008837775 = product of:
  0.01767555 = sum of:
    0.01767555 = product of:
      0.0353511 = sum of:
        0.0353511 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
          0.0353511 = score(doc=889,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.19345059 = fieldWeight in 889, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 1.2023 18:57:12

Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01

0.008837775 = product of:
  0.01767555 = sum of:
    0.01767555 = product of:
      0.0353511 = sum of:
        0.0353511 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
          0.0353511 = score(doc=1012,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.19345059 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 6.2023 14:55:20

Endres-Niggemeyer, B.: Summarizing information (1998) 0.01
```
0.008167865 = product of:
  0.01633573 = sum of:
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = weight(_text_:systems in 688) [ClassicSimilarity], result of:
          0.03267146 = score(doc=688,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2037246 = fieldWeight in 688, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=688)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Summarizing is the process of reducing the large information size of something like a novel or a scientific paper to a short summary or abstract comprising only the most essential points. Summarizing is frequent in everyday communication, but it is also a professional skill for journalists and others. Automated summarizing functions are urgently needed by Internet users who wish to avoid being overwhelmed by information. This book presents the state of the art and surveys related research; it deals with everyday and professional summarizing as well as computerized approaches. The author focuses in detail on the cognitive process involved in summarizing and supports this with a multimedia simulation system on the accompanying CD-ROM
Hirao, T.; Okumura, M.; Yasuda, N.; Isozaki, H.: Supervised automatic evaluation for summarization with voted regression model (2007) 0.01
```
0.008167865 = product of:
  0.01633573 = sum of:
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = weight(_text_:systems in 942) [ClassicSimilarity], result of:
          0.03267146 = score(doc=942,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2037246 = fieldWeight in 942, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=942)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The high quality evaluation of generated summaries is needed if we are to improve automatic summarization systems. Although human evaluation provides better results than automatic evaluation methods, its cost is huge and it is difficult to reproduce the results. Therefore, we need an automatic method that simulates human evaluation if we are to improve our summarization system efficiently. Although automatic evaluation methods have been proposed, they are unreliable when used for individual summaries. To solve this problem, we propose a supervised automatic evaluation method based on a new regression model called the voted regression model (VRM). VRM has two characteristics: (1) model selection based on 'corrected AIC' to avoid multicollinearity, (2) voting by the selected models to alleviate the problem of overfitting. Evaluation results obtained for TSC3 and DUC2004 show that our method achieved error reductions of about 17-51% compared with conventional automatic evaluation methods. Moreover, our method obtained the highest correlation coefficients in several different experiments.

Abdi, A.; Idris, N.; Alguliev, R.M.; Aliguliyev, R.M.: Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems (2015) 0.01

0.008167865 = product of:
  0.01633573 = sum of:
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = weight(_text_:systems in 2681) [ClassicSimilarity], result of:
          0.03267146 = score(doc=2681,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2037246 = fieldWeight in 2681, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=2681)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Reeve, L.H.; Han, H.; Brooks, A.D.: The use of domain-specific concepts in biomedical text summarization (2007) 0.01
```
0.0068065543 = product of:
  0.013613109 = sum of:
    0.013613109 = product of:
      0.027226217 = sum of:
        0.027226217 = weight(_text_:systems in 955) [ClassicSimilarity], result of:
          0.027226217 = score(doc=955,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.1697705 = fieldWeight in 955, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=955)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high-volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually-generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician's evaluation of three randomly-selected papers from an evaluation corpus to show that the author's abstract does not always reflect the entire contents of the full-text.
