Search (114 results, page 2 of 6)

Xianghao, G.; Yixin, Z.; Li, Y.: ¬A new method of news test understanding and abstracting based on speech acts theory (1998) 0.00

0.0030255679 = product of:
  0.0060511357 = sum of:
    0.0060511357 = product of:
      0.012102271 = sum of:
        0.012102271 = weight(_text_:a in 3532) [ClassicSimilarity], result of:
          0.012102271 = score(doc=3532,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22789092 = fieldWeight in 3532, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3532)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Presents a method for the automated analysis and comprehension of foreign affairs news produced by a Chinese news agency. Notes that the development of the method was prededed by a study of the structuring rules of the news. Describes how an abstract of the news story is produced automatically from the analysis. Stresses the main aim of the work which is to use specch act theory to analyse and classify sentences
Type: a

Liu, J.; Wu, Y.; Zhou, L.: ¬A hybrid method for abstracting newspaper articles (1999) 0.00

0.0030255679 = product of:
  0.0060511357 = sum of:
    0.0060511357 = product of:
      0.012102271 = sum of:
        0.012102271 = weight(_text_:a in 4059) [ClassicSimilarity], result of:
          0.012102271 = score(doc=4059,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22789092 = fieldWeight in 4059, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=4059)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistics heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type catering for various users with different reqirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
Type: a

Sweeney, S.; Crestani, F.; Losada, D.E.: 'Show me more' : incremental length summarisation using novelty detection (2008) 0.00
```
0.0029294936 = product of:
  0.005858987 = sum of:
    0.005858987 = product of:
      0.011717974 = sum of:
        0.011717974 = weight(_text_:a in 2054) [ClassicSimilarity], result of:
          0.011717974 = score(doc=2054,freq=24.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22065444 = fieldWeight in 2054, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2054)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The paper presents a study investigating the effects of incorporating novelty detection in automatic text summarisation. Condensing a textual document, automatic text summarisation can reduce the need to refer to the source document. It also offers a means to deliver device-friendly content when accessing information in non-traditional environments. An effective method of summarisation could be to produce a summary that includes only novel information. However, a consequence of focusing exclusively on novel parts may result in a loss of context, which may have an impact on the correct interpretation of the summary, with respect to the source document. In this study we compare two strategies to produce summaries that incorporate novelty in different ways: a constant length summary, which contains only novel sentences, and an incremental summary, containing additional sentences that provide context. The aim is to establish whether a summary that contains only novel sentences provides sufficient basis to determine relevance of a document, or if indeed we need to include additional sentences to provide context. Findings from the study seem to suggest that there is only a minimal difference in performance for the tasks we set our users and that the presence of contextual information is not so important. However, for the case of mobile information access, a summary that contains only novel information does offer benefits, given bandwidth constraints.

Type

a
Kim, H.H.; Kim, Y.H.: ERP/MMR algorithm for classifying topic-relevant and topic-irrelevant visual shots of documentary videos (2019) 0.00
```
0.0029294936 = product of:
  0.005858987 = sum of:
    0.005858987 = product of:
      0.011717974 = sum of:
        0.011717974 = weight(_text_:a in 5358) [ClassicSimilarity], result of:
          0.011717974 = score(doc=5358,freq=24.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22065444 = fieldWeight in 5358, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5358)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

We propose and evaluate a video summarization method based on a topic relevance model, a maximal marginal relevance (MMR), and discriminant analysis to generate a semantically meaningful video skim. The topic relevance model uses event-related potential (ERP) components to describe the process of topic relevance judgment. More specifically, the topic relevance model indicates that N400 and P600, which have been successfully applied to the mismatch process of a stimulus and the discourse-internal reorganization and integration process of a stimulus, respectively, are used for the topic mismatch process of a topic-irrelevant video shot and the topic formation process of a topic-relevant video shot. To evaluate our proposed ERP/MMR-based method, we compared the video skims generated by the ERP/MMR-based, ERP-based, and shot boundary detection (SBD) methods with ground truth skims. The results showed that at a significance level of 0.05, the ROUGE-1 scores of the ERP/MMR method are statistically higher than those of the SBD method, and the diversity scores of the ERP/MMR method are statistically higher than those of the ERP method. This study suggested that the proposed method may be applied to the construction of a video skim without operational intervention, such as the insertion of a black screen between video shots.

Type

a
Endres-Niggemeyer, B.; Maier, E.; Sigel, A.: How to implement a naturalistic model of abstracting : four core working steps of an expert abstractor (1995) 0.00
```
0.0029000505 = product of:
  0.005800101 = sum of:
    0.005800101 = product of:
      0.011600202 = sum of:
        0.011600202 = weight(_text_:a in 2930) [ClassicSimilarity], result of:
          0.011600202 = score(doc=2930,freq=12.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.21843673 = fieldWeight in 2930, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

4 working steps taken from a comprehensive empirical model of expert abstracting are studied in order to prepare an explorative implementation of a simulation model. It aims at explaining the knowledge processing activities during professional summarizing. Following the case-based and holistic strategy of qualitative empirical research, the main features of the simulation system were developed by investigating in detail a small but central test case - 4 working steps where an expert abstractor discovers what the paper is about and drafts the topic sentence of the abstract

Type

a

Endres-Niggemeyer, B.: ¬An empirical process model of abstracting (1992) 0.00

0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 8834) [ClassicSimilarity], result of:
          0.011481222 = score(doc=8834,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 8834, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=8834)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Mensch und Maschine: Informationelle Schnittstellen der Kommunikation. Proc. des 3. Int. Symposiums für Informationswissenschaft (ISI'92), 5.-7.11.1992 in Saarbrücken. Hrsg.: H.H. Zimmermann, H.-D. Luckhardt u. A. Schulz
Type: a

Johnson, F.C.: ¬A critical view of system-centered to user-centered evaluation of automatic abstracting research (1999) 0.00

0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 2994) [ClassicSimilarity], result of:
          0.011481222 = score(doc=2994,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 2994, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=2994)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Dorr, B.J.; Gaasterland, T.: Exploiting aspectual features and connecting words for summarization-inspired temporal-relation extraction (2007) 0.00
```
0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 950) [ClassicSimilarity], result of:
          0.011481222 = score(doc=950,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 950, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=950)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This paper presents a model that incorporates contemporary theories of tense and aspect and develops a new framework for extracting temporal relations between two sentence-internal events, given their tense, aspect, and a temporal connecting word relating the two events. A linguistic constraint on event combination has been implemented to detect incorrect parser analyses and potentially apply syntactic reanalysis or semantic reinterpretation - in preparation for subsequent processing for multi-document summarization. An important contribution of this work is the extension of two different existing theoretical frameworks - Hornstein's 1990 theory of tense analysis and Allen's 1984 theory on event ordering - and the combination of both into a unified system for representing and constraining combinations of different event types (points, closed intervals, and open-ended intervals). We show that our theoretical results have been verified in a large-scale corpus analysis. The framework is designed to inform a temporally motivated sentence-ordering module in an implemented multi-document summarization system.

Type

a
Martinez-Romo, J.; Araujo, L.; Fernandez, A.D.: SemGraph : extracting keyphrases following a novel semantic graph-based approach (2016) 0.00
```
0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 2832) [ClassicSimilarity], result of:
          0.011481222 = score(doc=2832,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 2832, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2832)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Keyphrases represent the main topics a text is about. In this article, we introduce SemGraph, an unsupervised algorithm for extracting keyphrases from a collection of texts based on a semantic relationship graph. The main novelty of this algorithm is its ability to identify semantic relationships between words whose presence is statistically significant. Our method constructs a co-occurrence graph in which words appearing in the same document are linked, provided their presence in the collection is statistically significant with respect to a null model. Furthermore, the graph obtained is enriched with information from WordNet. We have used the most recent and standardized benchmark to evaluate the system ability to detect the keyphrases that are part of the text. The result is a method that achieves an improvement of 5.3% and 7.28% in F measure over the two labeled sets of keyphrases used in the evaluation of SemEval-2010.

Type

a
Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.00
```
0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 5089) [ClassicSimilarity], result of:
          0.011481222 = score(doc=5089,freq=36.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 5089, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03125 = fieldNorm(doc=5089)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method to query-based multi-documents summarization of opinion expressed in reviews. QMOS combines multiple sentiment dictionaries to improve word coverage limit of the individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word presented by a lexicon and the word polarity in a specific context. This is due to the fact that, the polarity of a word depends on the context in which it is being used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. Therefore, to tackle the aforementioned challenges, QMOS integrates multiple strategies to adjust word prior sentiment orientation while also considers the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon. On the other hand, the most of the existing methods fail to distinguish the meaning of a review sentence and user's query when both of them share the similar bag-of-words; hence there is often a conflict between the extracted opinionated sentences and users' needs. However, the summarization phase of QMOS is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs the greedy algorithm and query expansion approach to reduce redundancy and bridge the lexical gaps for similar contexts that are expressed using different wording, respectively. Our experiment shows that the QMOS method can significantly improve the performance and make QMOS comparable to other existing methods.

Type

a
Kannan, R.; Ghinea, G.; Swaminathan, S.: What do you wish to see? : A summarization system for movies based on user preferences (2015) 0.00
```
0.0027894354 = product of:
  0.005578871 = sum of:
    0.005578871 = product of:
      0.011157742 = sum of:
        0.011157742 = weight(_text_:a in 2683) [ClassicSimilarity], result of:
          0.011157742 = score(doc=2683,freq=34.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.21010503 = fieldWeight in 2683, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03125 = fieldNorm(doc=2683)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Video summarization aims at producing a compact version of a full-length video while preserving the significant content of the original video. Movie summarization condenses a full-length movie into a summary that still retains the most significant and interesting content of the original movie. In the past, several movie summarization systems have been proposed to generate a movie summary based on low-level video features such as color, motion, texture, etc. However, a generic summary, which is common to everyone and is produced based only on low-level video features will not satisfy every user. As users' preferences for the summary differ vastly for the same movie, there is a need for a personalized movie summarization system nowadays. To address this demand, this paper proposes a novel system to generate semantically meaningful video summaries for the same movie, which are tailored to the preferences and interests of a user. For a given movie, shots and scenes are automatically detected and their high-level features are semi-automatically annotated. Preferences over high-level movie features are explicitly collected from the user using a query interface. The user preferences are generated by means of a stored-query. Movie summaries are generated at shot level and scene level, where shots or scenes are selected for summary skim based on the similarity measured between shots and scenes, and the user's preferences. The proposed movie summarization system is evaluated subjectively using a sample of 20 subjects with eight movies in the English language. The quality of the generated summaries is assessed by informativeness, enjoyability, relevance, and acceptance metrics and Quality of Perception measures. Further, the usability of the proposed summarization system is subjectively evaluated by conducting a questionnaire survey. The experimental results on the performance of the proposed movie summarization approach show the potential of the proposed system.

Type

a

Craven, T.C.: ¬A computer-aided abstracting tool kit (1993) 0.00

0.00270615 = product of:
  0.0054123 = sum of:
    0.0054123 = product of:
      0.0108246 = sum of:
        0.0108246 = weight(_text_:a in 6506) [ClassicSimilarity], result of:
          0.0108246 = score(doc=6506,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20383182 = fieldWeight in 6506, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=6506)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Describes the abstracting assistance features being prototyped in the TEXNET text network management system. Sentence weighting methods include: weithing negatively or positively on the stems in a selected passage; weighting on general lists of cue words, adjusting weights of selected segments; and weighting of occurrence of frequent stems. The user may adjust a number of parameters: the minimum strength of extracts; the threshold for frequent word/stems and the amount sentence weight is to be adjusted for each weighting type
Type: a

Paice, C.D.: Automatic abstracting (1994) 0.00

0.00270615 = product of:
  0.0054123 = sum of:
    0.0054123 = product of:
      0.0108246 = sum of:
        0.0108246 = weight(_text_:a in 1255) [ClassicSimilarity], result of:
          0.0108246 = score(doc=1255,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20383182 = fieldWeight in 1255, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.125 = fieldNorm(doc=1255)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Brandow, R.; Mitze, K.; Rau, L.F.: Automatic condensation of electronic publications by sentence selection (1995) 0.00
```
0.00270615 = product of:
  0.0054123 = sum of:
    0.0054123 = product of:
      0.0108246 = sum of:
        0.0108246 = weight(_text_:a in 2929) [ClassicSimilarity], result of:
          0.0108246 = score(doc=2929,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20383182 = fieldWeight in 2929, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=2929)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Description of a system that performs domain-independent automatic condensation of news from a large commercial news service encompassing 41 different publications. This system was evaluated against a system that condensed the same articles using only the first portions of the texts (the löead), up to the target length of the summaries. 3 lengths of articles were evaluated for 250 documents by both systems, totalling 1.500 suitability judgements in all. The lead-based summaries outperformed the 'intelligent' summaries significantly, achieving acceptability ratings of over 90%, compared to 74,7%

Type

a

Sjöbergh, J.: Older versions of the ROUGEeval summarization evaluation system were easier to fool (2007) 0.00

0.00270615 = product of:
  0.0054123 = sum of:
    0.0054123 = product of:
      0.0108246 = sum of:
        0.0108246 = weight(_text_:a in 940) [ClassicSimilarity], result of:
          0.0108246 = score(doc=940,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20383182 = fieldWeight in 940, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=940)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: We show some limitations of the ROUGE evaluation method for automatic summarization. We present a method for automatic summarization based on a Markov model of the source text. By a simple greedy word selection strategy, summaries with high ROUGE-scores are generated. These summaries would however not be considered good by human readers. The method can be adapted to trick different settings of the ROUGEeval package.
Type: a

Maybury, M.T.: Generating summaries from event data (1995) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 2349) [ClassicSimilarity], result of:
          0.010739701 = score(doc=2349,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 2349, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2349)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Summarization entails analysis of source material, selection of key information, condensation of this, and generation of a compct summary form. While there habe been many investigations into the automatic summarization of text, relatively little attention has been given to the summarization of information from structured information sources such as data of knowledge bases, despite this being a desirable capability for a number of application areas including report generation from databases (e.g. weather, financial, medical) and simulation (e.g. military, manufacturing, aconomic). After a brief introduction indicating the main elements of summarization and referring to some illustrative approaches to it, considers pecific issues in the generation of text summaries of event data, describes a system, SumGen, which selects key information from an event database by reasoning about event frequencies, frequencies of relations between events, and domain specific importance measures. Describes how Sum Gen then aggregates similar information and plans a summary presentations tailored to stereotypical users

Type

a
Ye, S.; Chua, T.-S.; Kan, M.-Y.; Qiu, L.: Document concept lattice for text understanding and summarization (2007) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 941) [ClassicSimilarity], result of:
          0.010739701 = score(doc=941,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 941, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=941)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

We argue that the quality of a summary can be evaluated based on how many concepts in the original document(s) that can be preserved after summarization. Here, a concept refers to an abstract or concrete entity or its action often expressed by diverse terms in text. Summary generation can thus be considered as an optimization problem of selecting a set of sentences with minimal answer loss. In this paper, we propose a document concept lattice that indexes the hierarchy of local topics tied to a set of frequent concepts and the corresponding sentences containing these topics. The local topics will specify the promising sub-spaces related to the selected concepts and sentences. Based on this lattice, the summary is an optimized selection of a set of distinct and salient local topics that lead to maximal coverage of concepts with the given number of sentences. Our summarizer based on the concept lattice has demonstrated competitive performance in Document Understanding Conference 2005 and 2006 evaluations as well as follow-on tests.

Type

a
Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 945) [ClassicSimilarity], result of:
          0.010739701 = score(doc=945,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 945, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. Along with that, we address the issue of data paucity by collecting data from RSS feeds available on the Internet, and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].

Type

a
Salton, G.: Automatic text structuring and summarization (1997) 0.00
```
0.0026473717 = product of:
  0.0052947435 = sum of:
    0.0052947435 = product of:
      0.010589487 = sum of:
        0.010589487 = weight(_text_:a in 145) [ClassicSimilarity], result of:
          0.010589487 = score(doc=145,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.19940455 = fieldWeight in 145, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=145)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Applies the ideas from the automatic link generation research to automatic text summarisation. Using techniques for inter-document link generation, generates intra-document links between passages of a document. Based on the intra-document linkage pattern of a text, characterises the structure of the text. Applies the knowledge of text structure to do automatic text summarisation by passage extraction. Evaluates a set of 50 summaries generated using these techniques by comparing the to paragraph extracts constructed by humans. The automatic summarisation methods perform well, especially in view of the fact that the summaries generates by 2 humans for the same article are surprisingly dissimilar

Footnote

Contribution to a special issue on methods and tools for the automatic construction of hypertext

Type

a
Moens, M.F.; Dumortier, J.: Use of a text grammar for generating highlight abstracts of magazine articles (2000) 0.00
```
0.0026473717 = product of:
  0.0052947435 = sum of:
    0.0052947435 = product of:
      0.010589487 = sum of:
        0.010589487 = weight(_text_:a in 4540) [ClassicSimilarity], result of:
          0.010589487 = score(doc=4540,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.19940455 = fieldWeight in 4540, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4540)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Browsing a database of article abstracts is one way to select and buy relevant magazine articles online. Our research contributes to the design and development of text grammars for abstracting texts in unlimited subject domains. We developed a system that parses texts based on the text grammar of a specific text type and that extracts sentences and statements which are relevant for inclusion in the abstracts. The system employs knowledge of the discourse patterns that are typical of news stories. The results are encouraging and demonstrate the importance of discourse structures in text summarisation.

Type

a

Search (114 results, page 2 of 6)

Authors

Years

Languages

Types

Themes

Subjects