Search (85 results, page 1 of 5)

Edmundson, H.P.; Wyllis, R.E.: Problems in automatic abstracting (1964) 0.34

0.33929676 = product of:
  0.45239568 = sum of:
    0.025779642 = weight(_text_:for in 3670) [ClassicSimilarity], result of:
      0.025779642 = score(doc=3670,freq=2.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29041752 = fieldWeight in 3670, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.109375 = fieldNorm(doc=3670)
    0.22375791 = weight(_text_:computing in 3670) [ClassicSimilarity], result of:
      0.22375791 = score(doc=3670,freq=2.0), product of:
        0.26151994 = queryWeight, product of:
          5.5314693 = idf(docFreq=475, maxDocs=44218)
          0.047278564 = queryNorm
        0.85560554 = fieldWeight in 3670, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.5314693 = idf(docFreq=475, maxDocs=44218)
          0.109375 = fieldNorm(doc=3670)
    0.20285812 = product of:
      0.40571624 = sum of:
        0.40571624 = weight(_text_:machinery in 3670) [ClassicSimilarity], result of:
          0.40571624 = score(doc=3670,freq=2.0), product of:
            0.35214928 = queryWeight, product of:
              7.448392 = idf(docFreq=69, maxDocs=44218)
              0.047278564 = queryNorm
            1.1521144 = fieldWeight in 3670, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.448392 = idf(docFreq=69, maxDocs=44218)
              0.109375 = fieldNorm(doc=3670)
      0.5 = coord(1/2)
  0.75 = coord(3/4)
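
The arithmetic in the explanation tree above can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which reproduce the figures printed for doc 3670. The function names `idf` and `term_score` are illustrative only and are not part of any library.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency as it appears in the tree:
    # idf(docFreq=18385, maxDocs=44218) is roughly 1.8775
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)                        # tf(freq=2.0) is roughly 1.4142
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the explanation for doc 3670.
MAX_DOCS = 44218
QUERY_NORM = 0.047278564
FIELD_NORM = 0.109375

s_for = term_score(2.0, 18385, MAX_DOCS, QUERY_NORM, FIELD_NORM)
s_computing = term_score(2.0, 475, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# "machinery" sits in a nested clause that is scaled by coord(1/2).
s_machinery = 0.5 * term_score(2.0, 69, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Three of the four top-level query clauses matched, hence coord(3/4).
total = 0.75 * (s_for + s_computing + s_machinery)
print(round(total, 4))
```

The result agrees with the listed 0.33929676 up to floating-point rounding, since the index stores single-precision values while this sketch uses double precision. The same functions apply to the other entries once their `freq`, `docFreq`, `fieldNorm` and `coord` values are substituted.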

Source: Communications of the Association for Computing Machinery. 7(1964) no.1, S.259-263

Paice, C.D.: Automatic abstracting (1994) 0.09

0.08912055 = product of:
  0.1782411 = sum of:
    0.01841403 = weight(_text_:for in 917) [ClassicSimilarity], result of:
      0.01841403 = score(doc=917,freq=2.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.20744109 = fieldWeight in 917, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.078125 = fieldNorm(doc=917)
    0.15982707 = weight(_text_:computing in 917) [ClassicSimilarity], result of:
      0.15982707 = score(doc=917,freq=2.0), product of:
        0.26151994 = queryWeight, product of:
          5.5314693 = idf(docFreq=475, maxDocs=44218)
          0.047278564 = queryNorm
        0.6111468 = fieldWeight in 917, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.5314693 = idf(docFreq=475, maxDocs=44218)
          0.078125 = fieldNorm(doc=917)
  0.5 = coord(2/4)

Abstract: The final report of the 2nd British Library abstracting project (the BLAB project), 1990-1992, which was carried out partly at the Computing Department of Lancaster University, and partly at the Centre for Computational Linguistics, UMIST. This project built on the results of the first project, of 1985-1987, to build a system designed to create abstracts automatically from given texts

Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.02

0.023227734 = product of:
  0.04645547 = sum of:
    0.020833097 = weight(_text_:for in 6751) [ClassicSimilarity], result of:
      0.020833097 = score(doc=6751,freq=4.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.23469281 = fieldWeight in 6751, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0625 = fieldNorm(doc=6751)
    0.025622372 = product of:
      0.051244743 = sum of:
        0.051244743 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
          0.051244743 = score(doc=6751,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.30952093 = fieldWeight in 6751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6751)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Abstract: Presents a system for summarizing quantitative data in natural language, focusing on the use of a corpus of basketball game summaries, drawn from online news services, to empirically shape the system design and to evaluate the approach. Initial corpus analysis revealed characteristics of textual summaries that challenge the capabilities of current language generation systems. A revision based corpus analysis was used to identify and encode the revision rules of the system. Presents a quantitative evaluation, using several test corpora, to measure the robustness of the new revision based model
Date: 6. 3.1997 16:22:15

Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.02

0.020176798 = product of:
  0.040353596 = sum of:
    0.014731225 = weight(_text_:for in 6599) [ClassicSimilarity], result of:
      0.014731225 = score(doc=6599,freq=2.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.16595288 = fieldWeight in 6599, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0625 = fieldNorm(doc=6599)
    0.025622372 = product of:
      0.051244743 = sum of:
        0.051244743 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
          0.051244743 = score(doc=6599,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.30952093 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Date: 26. 2.1997 10:22:43
Source: Microcomputers for information management. 13(1996) no.1, S.41-55

Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.02
```
0.019283235 = product of:
  0.03856647 = sum of:
    0.022552488 = weight(_text_:for in 1012) [ClassicSimilarity], result of:
      0.022552488 = score(doc=1012,freq=12.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.2540624 = fieldWeight in 1012, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1012)
    0.016013984 = product of:
      0.032027967 = sum of:
        0.032027967 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
          0.032027967 = score(doc=1012,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.19345059 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases are contributing increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp the paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the Macro-avgs of , , and on the Paper with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.

Date

22. 6.2023 14:55:20

Source

Journal of the Association for Information Science and Technology. 74(2023) no.7, S.759-774
Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.02
```
0.0159805 = product of:
  0.031961 = sum of:
    0.01594702 = weight(_text_:for in 5290) [ClassicSimilarity], result of:
      0.01594702 = score(doc=5290,freq=6.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.17964928 = fieldWeight in 5290, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5290)
    0.016013984 = product of:
      0.032027967 = sum of:
        0.032027967 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
          0.032027967 = score(doc=5290,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.19345059 = fieldWeight in 5290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: The more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding new identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than the systems we compared to and that the learning function is effective.

Date

22. 7.2006 17:25:48

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.6, S.740-752
Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.02
```
0.0159805 = product of:
  0.031961 = sum of:
    0.01594702 = weight(_text_:for in 889) [ClassicSimilarity], result of:
      0.01594702 = score(doc=889,freq=6.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.17964928 = fieldWeight in 889, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=889)
    0.016013984 = product of:
      0.032027967 = sum of:
        0.032027967 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
          0.032027967 = score(doc=889,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.19345059 = fieldWeight in 889, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.

Date

22. 1.2023 18:57:12

Source

Journal of the Association for Information Science and Technology. 74(2023) no.2, S.234-248
Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01
```
0.014517335 = product of:
  0.02903467 = sum of:
    0.013020686 = weight(_text_:for in 2640) [ClassicSimilarity], result of:
      0.013020686 = score(doc=2640,freq=4.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.14668301 = fieldWeight in 2640, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2640)
    0.016013984 = product of:
      0.032027967 = sum of:
        0.032027967 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
          0.032027967 = score(doc=2640,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.19345059 = fieldWeight in 2640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2640)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

We propose a tag-based framework that simulates human abstractors' ability to select significant sentences based on key concepts in a sentence as well as the semantic relations between key concepts to create generic summaries of transcribed lecture videos. The proposed extractive summarization method uses tags (viewer- and author-assigned terms) as key concepts. Our method employs Flickr tag clusters and WordNet synonyms to expand tags and detect the semantic relations between tags. This method helps select sentences that have a greater number of semantically related key concepts. To investigate the effectiveness and uniqueness of the proposed method, we compare it with an existing technique, latent semantic analysis (LSA), using intrinsic and extrinsic evaluations. The results of intrinsic evaluation show that the tag-based method is as effective as or more effective than the LSA method. We also observe that in the extrinsic evaluation, the grand mean accuracy score of the tag-based method is higher than that of the LSA method, with a statistically significant difference. Elaborating on our results, we discuss the theoretical and practical implications of our findings for speech video summarization and retrieval.

Date

22. 1.2016 12:29:41

Source

Journal of the Association for Information Science and Technology. 67(2016) no.2, S.366-379
Liu, J.; Wu, Y.; Zhou, L.: ¬A hybrid method for abstracting newspaper articles (1999) 0.01
```
0.008235005 = product of:
  0.03294002 = sum of:
    0.03294002 = weight(_text_:for in 4059) [ClassicSimilarity], result of:
      0.03294002 = score(doc=4059,freq=10.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.37108192 = fieldWeight in 4059, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0625 = fieldNorm(doc=4059)
  0.25 = coord(1/4)
```
Abstract

This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type catering for various users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information

Source

Journal of the American Society for Information Science. 50(1999) no.13, S.1234-1245

McKeown, K.; Robin, J.; Kukich, K.: Generating concise natural language summaries (1995) 0.01

0.00797351 = product of:
  0.03189404 = sum of:
    0.03189404 = weight(_text_:for in 2932) [ClassicSimilarity], result of:
      0.03189404 = score(doc=2932,freq=6.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.35929856 = fieldWeight in 2932, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.078125 = fieldNorm(doc=2932)
  0.25 = coord(1/4)

Abstract: Description of the problems for summary generation, the applications developed (for basketball games - STREAK and for telephone network planning activity - PLANDOC), the linguistic constructions that the systems use to convey information concisely and the textual constraints that determine what information gets included

Marsh, E.: ¬A production rule system for message summarisation (1984) 0.01

0.006510343 = product of:
  0.026041372 = sum of:
    0.026041372 = weight(_text_:for in 1956) [ClassicSimilarity], result of:
      0.026041372 = score(doc=1956,freq=4.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29336601 = fieldWeight in 1956, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.078125 = fieldNorm(doc=1956)
  0.25 = coord(1/4)

Source: Proceedings of the American Association for artificial intelligence

Kim, H.H.; Kim, Y.H.: Video summarization using event-related potential responses to shot boundaries in real-time video watching (2019) 0.01
```
0.006510343 = product of:
  0.026041372 = sum of:
    0.026041372 = weight(_text_:for in 4685) [ClassicSimilarity], result of:
      0.026041372 = score(doc=4685,freq=16.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29336601 = fieldWeight in 4685, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4685)
  0.25 = coord(1/4)
```
Abstract

Our aim was to develop an event-related potential (ERP)-based method to construct a video skim consisting of key shots to bridge the semantic gap between the topic inferred from a whole video and that from its summary. Mayer's cognitive model was examined, wherein the topic integration process of a user evoked by a visual stimulus can be associated with long-latency ERP components. We determined that long-latency ERP components are suitable for measuring a user's neuronal response through a literature review. We hypothesized that N300 is specific to the categorization of all shots regardless of topic relevance, N400 is specific for the semantic mismatching process for topic-irrelevant shots, and P600 is specific for the context updating process for topic-relevant shots. In our experiment, the N400 component led to more negative ERP signals in response to topic-irrelevant shots than to topic-relevant shots and showed a fronto-central scalp pattern. P600 elicited more positive ERP signals for topic-relevant shots than for topic-irrelevant shots and showed a fronto-central scalp pattern. We used discriminant and artificial neural network (ANN) analyses to decode video shot relevance and observed that the ANN produced particularly high success rates: 91.3% from the training set and 100% from the test set.

Source

Journal of the Association for Information Science and Technology. 70(2019) no.2, S.164-175
Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.01
```
0.0064449105 = product of:
  0.025779642 = sum of:
    0.025779642 = weight(_text_:for in 944) [ClassicSimilarity], result of:
      0.025779642 = score(doc=944,freq=8.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29041752 = fieldWeight in 944, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0546875 = fieldNorm(doc=944)
  0.25 = coord(1/4)
```
Abstract

This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization - a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.
Moens, M.-F.: Summarizing court decisions (2007) 0.01
```
0.0064449105 = product of:
  0.025779642 = sum of:
    0.025779642 = weight(_text_:for in 954) [ClassicSimilarity], result of:
      0.025779642 = score(doc=954,freq=8.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29041752 = fieldWeight in 954, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0546875 = fieldNorm(doc=954)
  0.25 = coord(1/4)
```
Abstract

In the field of law there is an absolute need for summarizing the texts of court decisions in order to make the content of the cases easily accessible for legal professionals. During the SALOMON and MOSAIC projects we investigated the summarization and retrieval of legal cases. This article presents some of the main findings while integrating the research results of experiments on legal document summarization by other research groups. In addition, we propose novel avenues of research for automatic text summarization, which we currently exploit when summarizing court decisions in the ACILA project. Techniques for automated concept learning and argument recognition are here the most challenging.
Lee, J.-H.; Park, S.; Ahn, C.-M.; Kim, D.: Automatic generic document summarization based on non-negative matrix factorization (2009) 0.01
```
0.0064449105 = product of:
  0.025779642 = sum of:
    0.025779642 = weight(_text_:for in 2448) [ClassicSimilarity], result of:
      0.025779642 = score(doc=2448,freq=8.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29041752 = fieldWeight in 2448, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2448)
  0.25 = coord(1/4)
```
Abstract

In existing unsupervised methods, Latent Semantic Analysis (LSA) is used for sentence selection. However, the obtained results are less meaningful, because singular vectors are used as the bases for sentence selection from given documents, and singular vector components can have negative values. We propose a new unsupervised method using Non-negative Matrix Factorization (NMF) to select sentences for automatic generic document summarization. The proposed method uses non-negative constraints, which are more similar to the human cognition process. As a result, the method selects more meaningful sentences for generic document summarization than those selected using LSA.
Marcu, D.: Automatic abstracting and summarization (2009) 0.01
```
0.0064449105 = product of:
  0.025779642 = sum of:
    0.025779642 = weight(_text_:for in 3748) [ClassicSimilarity], result of:
      0.025779642 = score(doc=3748,freq=8.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.29041752 = fieldWeight in 3748, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3748)
  0.25 = coord(1/4)
```
Abstract

After lying dormant for a few decades, the field of automated text summarization has experienced a tremendous resurgence of interest. Recently, many new algorithms and techniques have been proposed for identifying important information in single documents and document collections, and for mapping this information into grammatical, cohesive, and coherent abstracts. Since 1997, annual workshops, conferences, and large-scale comparative evaluations have provided a rich environment for exchanging ideas between researchers in Asia, Europe, and North America. This entry reviews the main developments in the field and provides a guiding map to those interested in understanding the strengths and weaknesses of an increasingly ubiquitous technology.

Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.01

0.006405593 = product of:
  0.025622372 = sum of:
    0.025622372 = product of:
      0.051244743 = sum of:
        0.051244743 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
          0.051244743 = score(doc=6974,freq=2.0), product of:
            0.16556148 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047278564 = queryNorm
            0.30952093 = fieldWeight in 6974, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6974)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon

Craven, T.C.: ¬A phrase flipper for the assistance of writers of abstracts and other text (1995) 0.01

0.0063788076 = product of:
  0.02551523 = sum of:
    0.02551523 = weight(_text_:for in 4897) [ClassicSimilarity], result of:
      0.02551523 = score(doc=4897,freq=6.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.28743884 = fieldWeight in 4897, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0625 = fieldNorm(doc=4897)
  0.25 = coord(1/4)

Abstract: Describes computerized tools for computer assisted abstracting. FlipPhr is a Microsoft Windows application program that rearranges (flips) phrases or other expressions in accordance with rules in a grammar. The flipping may be invoked with a single keystroke from within various Windows application programs that allow cutting and pasting of text. The user may modify the grammar to provide for different kinds of flipping

Dorr, B.J.; Gaasterland, T.: Exploiting aspectual features and connecting words for summarization-inspired temporal-relation extraction (2007) 0.01
```
0.0061762533 = product of:
  0.024705013 = sum of:
    0.024705013 = weight(_text_:for in 950) [ClassicSimilarity], result of:
      0.024705013 = score(doc=950,freq=10.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.27831143 = fieldWeight in 950, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.046875 = fieldNorm(doc=950)
  0.25 = coord(1/4)
```
Abstract

This paper presents a model that incorporates contemporary theories of tense and aspect and develops a new framework for extracting temporal relations between two sentence-internal events, given their tense, aspect, and a temporal connecting word relating the two events. A linguistic constraint on event combination has been implemented to detect incorrect parser analyses and potentially apply syntactic reanalysis or semantic reinterpretation - in preparation for subsequent processing for multi-document summarization. An important contribution of this work is the extension of two different existing theoretical frameworks - Hornstein's 1990 theory of tense analysis and Allen's 1984 theory on event ordering - and the combination of both into a unified system for representing and constraining combinations of different event types (points, closed intervals, and open-ended intervals). We show that our theoretical results have been verified in a large-scale corpus analysis. The framework is designed to inform a temporally motivated sentence-ordering module in an implemented multi-document summarization system.
Ling, X.; Jiang, J.; He, X.; Mei, Q.; Zhai, C.; Schatz, B.: Generating gene summaries from biomedical literature : a study of semi-structured summarization (2007) 0.01
```
0.006089868 = product of:
  0.024359472 = sum of:
    0.024359472 = weight(_text_:for in 946) [ClassicSimilarity], result of:
      0.024359472 = score(doc=946,freq=14.0), product of:
        0.08876751 = queryWeight, product of:
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.047278564 = queryNorm
        0.27441877 = fieldWeight in 946, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.8775425 = idf(docFreq=18385, maxDocs=44218)
          0.0390625 = fieldNorm(doc=946)
  0.25 = coord(1/4)
```
Abstract

Most knowledge accumulated through scientific discoveries in genomics and related biomedical disciplines is buried in the vast amount of biomedical literature. Since understanding gene regulations is fundamental to biomedical research, summarizing all the existing knowledge about a gene based on literature is highly desirable to help biologists digest the literature. In this paper, we present a study of methods for automatically generating gene summaries from biomedical literature. Unlike most existing work on automatic text summarization, in which the generated summary is often a list of extracted sentences, we propose to generate a semi-structured summary which consists of sentences covering specific semantic aspects of a gene. Such a semi-structured summary is more appropriate for describing genes and poses special challenges for automatic text summarization. We propose a two-stage approach to generate such a summary for a given gene - first retrieving articles about a gene and then extracting sentences for each specified semantic aspect. We address the issue of gene name variation in the first stage and propose several different methods for sentence extraction in the second stage. We evaluate the proposed methods using a test set with 20 genes. Experiment results show that the proposed methods can generate useful semi-structured gene summaries automatically from biomedical literature, and our proposed methods outperform general purpose summarization methods. Among all the proposed methods for sentence extraction, a probabilistic language modeling approach that models gene context performs the best.
