Search (95 results, page 5 of 5)

Hirao, T.; Okumura, M.; Yasuda, N.; Isozaki, H.: Supervised automatic evaluation for summarization with voted regression model (2007) 0.00
```
0.001757696 = product of:
  0.003515392 = sum of:
    0.003515392 = product of:
      0.007030784 = sum of:
        0.007030784 = weight(_text_:a in 942) [ClassicSimilarity], result of:
          0.007030784 = score(doc=942,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.13239266 = fieldWeight in 942, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=942)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The high quality evaluation of generated summaries is needed if we are to improve automatic summarization systems. Although human evaluation provides better results than automatic evaluation methods, its cost is huge and it is difficult to reproduce the results. Therefore, we need an automatic method that simulates human evaluation if we are to improve our summarization system efficiently. Although automatic evaluation methods have been proposed, they are unreliable when used for individual summaries. To solve this problem, we propose a supervised automatic evaluation method based on a new regression model called the voted regression model (VRM). VRM has two characteristics: (1) model selection based on 'corrected AIC' to avoid multicollinearity, (2) voting by the selected models to alleviate the problem of overfitting. Evaluation results obtained for TSC3 and DUC2004 show that our method achieved error reductions of about 17-51% compared with conventional automatic evaluation methods. Moreover, our method obtained the highest correlation coefficients in several different experiments.

Type

a
Shen, D.; Yang, Q.; Chen, Z.: Noise reduction through summarization for Web-page classification (2007) 0.00
```
0.001757696 = product of:
  0.003515392 = sum of:
    0.003515392 = product of:
      0.007030784 = sum of:
        0.007030784 = weight(_text_:a in 953) [ClassicSimilarity], result of:
          0.007030784 = score(doc=953,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.13239266 = fieldWeight in 953, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=953)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Due to a large variety of noisy information embedded in Web pages, Web-page classification is much more difficult than pure-text classification. In this paper, we propose to improve the Web-page classification performance by removing the noise through summarization techniques. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then put forward a new Web-page summarization algorithm based on Web-page layout and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that the classification algorithms (NB or SVM) augmented by any summarization approach can achieve an improvement by more than 5.0% as compared to pure-text-based classification algorithms. We further introduce an ensemble method to combine the different summarization algorithms. The ensemble summarization method achieves more than 12.0% improvement over pure-text based methods.

Type

a

Paice, C.D.: Automatic abstracting (1994) 0.00

0.0016913437 = product of:
  0.0033826875 = sum of:
    0.0033826875 = product of:
      0.006765375 = sum of:
        0.006765375 = weight(_text_:a in 917) [ClassicSimilarity], result of:
          0.006765375 = score(doc=917,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12739488 = fieldWeight in 917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=917)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: The final report of the 2nd British Library abstracting project (the BLAB project), 1990-1992, which was carried out partly at the Computing Department of Lancaster University, and partly at the Centre for Computational Linguistics, UMIST. This project built on the results of the first project, of 1985-1987, to build a system designed create abstracts automatically from given texts

Johnson, F.C.; Paice, C.D.; Black, W.J.; Neal, A.P.: ¬The application of linguistic processing to automatic abstract generation (1993) 0.00

0.0016913437 = product of:
  0.0033826875 = sum of:
    0.0033826875 = product of:
      0.006765375 = sum of:
        0.006765375 = weight(_text_:a in 2290) [ClassicSimilarity], result of:
          0.006765375 = score(doc=2290,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12739488 = fieldWeight in 2290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=2290)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

McKeown, K.; Robin, J.; Kukich, K.: Generating concise natural language summaries (1995) 0.00

0.0016913437 = product of:
  0.0033826875 = sum of:
    0.0033826875 = product of:
      0.006765375 = sum of:
        0.006765375 = weight(_text_:a in 2932) [ClassicSimilarity], result of:
          0.006765375 = score(doc=2932,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12739488 = fieldWeight in 2932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=2932)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Rodríguez-Vidal, J.; Carrillo-de-Albornoz, J.; Gonzalo, J.; Plaza, L.: Authority and priority signals in automatic summary generation for online reputation management (2021) 0.00
```
0.0016913437 = product of:
  0.0033826875 = sum of:
    0.0033826875 = product of:
      0.006765375 = sum of:
        0.006765375 = weight(_text_:a in 213) [ClassicSimilarity], result of:
          0.006765375 = score(doc=213,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12739488 = fieldWeight in 213, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=213)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Online reputation management (ORM) comprises the collection of techniques that help monitoring and improving the public image of an entity (companies, products, institutions) on the Internet. The ORM experts try to minimize the negative impact of the information about an entity while maximizing the positive material for being more trustworthy to the customers. Due to the huge amount of information that is published on the Internet every day, there is a need to summarize the entire flow of information to obtain only those data that are relevant to the entities. Traditionally the automatic summarization task in the ORM scenario takes some in-domain signals into account such as popularity, polarity for reputation and novelty but exists other feature to be considered, the authority of the people. This authority depends on the ability to convince others and therefore to influence opinions. In this work, we propose the use of authority signals that measures the influence of a user jointly with (a) priority signals related to the ORM domain and (b) information regarding the different topics that influential people is talking about. Our results indicate that the use of authority signals may significantly improve the quality of the summaries that are automatically generated.

Type

a
Uyttendaele, C.; Moens, M.-F.; Dumortier, J.: SALOMON: automatic abstracting of legal cases for effective access to court decisions (1998) 0.00
```
0.001674345 = product of:
  0.00334869 = sum of:
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = weight(_text_:a in 495) [ClassicSimilarity], result of:
          0.00669738 = score(doc=495,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12611452 = fieldWeight in 495, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=495)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The SALOMON project summarises Belgian criminal cases in order to improve access to the large number of existing and future cases. A double methodology was used when developing SALOMON: the cases are processed by employing additional knowledge to interpret structural patterns and features on the one hand and by way of occurrence statistics of index terms on the other. SALOMON performs an initial categorisation and structuring of the cases and subsequently extracts the most relevant text units of the alleged offences and of the opinion of the court. The SALOMON techniques do not themselves solve any legal questions, but they do guide the use effectively towards relevant texts

Type

a
Craven, T.C.: ¬An experiment in the use of tools for computer-assisted abstracting (1996) 0.00
```
0.0014351527 = product of:
  0.0028703054 = sum of:
    0.0028703054 = product of:
      0.005740611 = sum of:
        0.005740611 = weight(_text_:a in 7426) [ClassicSimilarity], result of:
          0.005740611 = score(doc=7426,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10809815 = fieldWeight in 7426, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=7426)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Experimental subjects wrote abstracts of an article using a simplified version of the TEXNET abstracting assistance software. In addition to the fulltext, the 35 subjects were presented with either keywords or phrases extracted automatically. The resulting abstracts, and the times taken, were recorded automatically; some additional information was gathered by oral questionnaire. Results showed considerable variation among subjects, but 37% found the keywords or phrases quite or very useful in writing their abstracts. Statistical analysis failed to support deveral hypothesised relations; phrases were not viewed as significantly more helpful than keywords; and abstracting experience did not correlate with originality of wording, approximation of the author abstract, or greater conciseness. Results also suggested possible modifications to the software

Type

a
Pinto, M.: Abstracting/abstract adaptation to digital environments : research trends (2003) 0.00
```
0.0014351527 = product of:
  0.0028703054 = sum of:
    0.0028703054 = product of:
      0.005740611 = sum of:
        0.005740611 = weight(_text_:a in 4446) [ClassicSimilarity], result of:
          0.005740611 = score(doc=4446,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10809815 = fieldWeight in 4446, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4446)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The technological revolution is affecting the structure, form and content of documents, reducing the effectiveness of traditional abstracts that, to some extent, are inadequate to the new documentary conditions. Aims to show the directions in which abstracting/abstracts can evolve to achieve the necessary adequacy in the new digital environments. Three researching trends are proposed: theoretical, methodological and pragmatic. Theoretically, there are some needs for expanding the document concept, reengineering abstracting and designing interdisciplinary models. Methodologically, the trend is toward the structuring, automating and qualifying of the abstracts. Pragmatically, abstracts networking, combined with alternative and complementary models, open a new and promising horizon. Automating, structuring and qualifying abstracting/abstract offer some short-term prospects for progress. Concludes that reengineering, networking and visualising would be middle-term fruitful areas of research toward the full adequacy of abstracting in the new electronic age.

Type

a
Wang, W.; Hwang, D.: Abstraction Assistant : an automatic text abstraction system (2010) 0.00
```
0.0014351527 = product of:
  0.0028703054 = sum of:
    0.0028703054 = product of:
      0.005740611 = sum of:
        0.005740611 = weight(_text_:a in 3981) [ClassicSimilarity], result of:
          0.005740611 = score(doc=3981,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10809815 = fieldWeight in 3981, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3981)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In the interest of standardization and quality assurance, it is desirable for authors and staff of access services to follow the American National Standards Institute (ANSI) guidelines in preparing abstracts. Using the statistical approach an extraction system (the Abstraction Assistant) was developed to generate informative abstracts to meet the ANSI guidelines for structural content elements. The system performance is evaluated by comparing the system-generated abstracts with the author's original abstracts and the manually enhanced system abstracts on three criteria: balance (satisfaction of the ANSI standards), fluency (text coherence), and understandability (clarity). The results suggest that it is possible to use the system output directly without manual modification, but there are issues that need to be addressed in further studies to make the system a better tool.

Type

a

Summarising software for publishing (1996) 0.00

0.001353075 = product of:
  0.00270615 = sum of:
    0.00270615 = product of:
      0.0054123 = sum of:
        0.0054123 = weight(_text_:a in 5121) [ClassicSimilarity], result of:
          0.0054123 = score(doc=5121,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10191591 = fieldWeight in 5121, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=5121)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Jones, S.; Paynter, G.W.: Automatic extractionof document keyphrases for use in digital libraries : evaluations and applications (2002) 0.00
```
0.0011959607 = product of:
  0.0023919214 = sum of:
    0.0023919214 = product of:
      0.0047838427 = sum of:
        0.0047838427 = weight(_text_:a in 601) [ClassicSimilarity], result of:
          0.0047838427 = score(doc=601,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.090081796 = fieldWeight in 601, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.

Type

a
Finegan-Dollak, C.; Radev, D.R.: Sentence simplification, compression, and disaggregation for summarization of sophisticated documents (2016) 0.00
```
0.0011959607 = product of:
  0.0023919214 = sum of:
    0.0023919214 = product of:
      0.0047838427 = sum of:
        0.0047838427 = weight(_text_:a in 3122) [ClassicSimilarity], result of:
          0.0047838427 = score(doc=3122,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.090081796 = fieldWeight in 3122, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3122)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Sophisticated documents like legal cases and biomedical articles can contain unusually long sentences. Extractive summarizers can select such sentences-potentially adding hundreds of unnecessary words to the summary-or exclude them and lose important content. Sentence simplification or compression seems on the surface to be a promising solution. However, compression removes words before the selection algorithm can use them, and simplification generates sentences that may be ambiguous in an extractive summary. We therefore compare the performance of an extractive summarizer selecting from the sentences of the original document with that of the summarizer selecting from sentences shortened in three ways: simplification, compression, and disaggregation, which splits one sentence into several according to rules designed to keep all meaning. We find that on legal cases and biomedical articles, these shortening methods generate ungrammatical output. Human evaluators performed an extrinsic evaluation consisting of comprehension questions about the summaries. Evaluators given compressed, simplified, or disaggregated versions of the summaries answered fewer questions correctly than did those given summaries with unaltered sentences. Error analysis suggests 2 causes: Altered sentences sometimes interact with the sentence selection algorithm, and alterations to sentences sometimes obscure information in the summary. We discuss future work to alleviate these problems.

Type

a

Moens, M.-F.: Summarizing court decisions (2007) 0.00

0.0011839407 = product of:
  0.0023678814 = sum of:
    0.0023678814 = product of:
      0.0047357627 = sum of:
        0.0047357627 = weight(_text_:a in 954) [ClassicSimilarity], result of:
          0.0047357627 = score(doc=954,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.089176424 = fieldWeight in 954, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=954)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.00

0.0010148063 = product of:
  0.0020296127 = sum of:
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = weight(_text_:a in 932) [ClassicSimilarity], result of:
          0.0040592253 = score(doc=932,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.07643694 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Search (95 results, page 5 of 5)

Authors

Years

Types

Themes

Subjects