Search (62 results, page 1 of 4)

  • theme_ss:"Automatisches Abstracting"
  1. Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.06
    0.063940994 = product of:
      0.14919566 = sum of:
        0.056486145 = weight(_text_:digital in 6599) [ClassicSimilarity], result of:
          0.056486145 = score(doc=6599,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.34865242 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
        0.07045048 = weight(_text_:techniques in 6599) [ClassicSimilarity], result of:
          0.07045048 = score(doc=6599,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3893711 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
        0.022259047 = product of:
          0.044518095 = sum of:
            0.044518095 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
              0.044518095 = score(doc=6599,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.30952093 = fieldWeight in 6599, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6599)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    With the onset of the information explosion arising from digital libraries and access to a wealth of information through the Internet, the need to efficiently determine the relevance of a document becomes even more urgent. Describes a text extraction system (TES), which retrieves a set of sentences from a document to form an indicative abstract. Such an automated process enables information to be filtered more quickly. Discusses the combination of various text extraction techniques. Compares results with manually produced abstracts
    Date
    26. 2.1997 10:22:43
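The score breakdowns shown with each result follow Lucene's ClassicSimilarity: per matching term, queryWeight = idf × queryNorm and fieldWeight = sqrt(termFreq) × idf × fieldNorm, and the document score is the coord-scaled sum of their products. A minimal Python sketch (an illustration, not this system's code) reproduces result 1's score from the numbers above:

```python
# ClassicSimilarity arithmetic from the breakdown above:
# score = coord * sum(queryWeight * fieldWeight) over matching terms
from math import sqrt

QUERY_NORM = 0.04107254  # shared queryNorm from the breakdown

def term_score(freq, idf, field_norm):
    query_weight = idf * QUERY_NORM            # queryWeight = idf * queryNorm
    field_weight = sqrt(freq) * idf * field_norm  # tf = sqrt(termFreq)
    return query_weight * field_weight

# (freq, idf, fieldNorm) for doc 6599, copied from the explanation above
digital    = term_score(2.0, 3.944552, 0.0625)
techniques = term_score(2.0, 4.405231, 0.0625)
term_22    = 0.5 * term_score(2.0, 3.5018296, 0.0625)  # inner coord(1/2)

print((digital + techniques + term_22) * 3 / 7)  # coord(3/7) -> ~0.063941
```

The same arithmetic accounts for every breakdown in this list; only freq, idf, and fieldNorm vary per document and field.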
  2. Abdi, A.; Idris, N.; Alguliev, R.M.; Aliguliyev, R.M.: Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems (2015) 0.04
    0.04294137 = product of:
      0.15029478 = sum of:
        0.04461906 = weight(_text_:processing in 2681) [ClassicSimilarity], result of:
          0.04461906 = score(doc=2681,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 2681, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=2681)
        0.10567571 = weight(_text_:techniques in 2681) [ClassicSimilarity], result of:
          0.10567571 = score(doc=2681,freq=8.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.5840566 = fieldWeight in 2681, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=2681)
      0.2857143 = coord(2/7)
    
    Abstract
Summary writing is a process for creating a short version of a source text. It can be used as a measure of understanding. As grading students' summaries is a very time-consuming task, computer-assisted assessment can help teachers perform the grading more effectively. Several techniques, such as BLEU, ROUGE, N-gram co-occurrence, Latent Semantic Analysis (LSA), LSA_Ngram and LSA_ERB, have been proposed to support the automatic assessment of students' summaries. Since these techniques are better suited to long texts, their performance is not satisfactory for the evaluation of short summaries. This paper proposes a specialized method that works well in assessing short summaries. The proposed method integrates the semantic relations between words and their syntactic composition, and as a result achieves higher accuracy and better performance than the current techniques. Experiments show that it is preferable to the existing techniques. A summary evaluation system based on the proposed method has also been developed.
    Source
    Information processing and management. 51(2015) no.4, S.340-358
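Of the assessment techniques this abstract lists, N-gram co-occurrence is the simplest to illustrate. Below is a toy ROUGE-1-style unigram recall between a student summary and a reference; this is only an illustration of that baseline, not the paper's proposed method:

```python
# Unigram co-occurrence (ROUGE-1-style recall) between two summaries
from collections import Counter

def unigram_recall(reference, candidate):
    """Fraction of reference words also present in the candidate summary."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(n, cand[w]) for w, n in ref.items())
    return overlap / max(sum(ref.values()), 1)

print(unigram_recall("the cat sat on the mat",
                     "a cat sat on a mat"))  # -> 0.666...
```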
  3. Salton, G.: Automatic text structuring and summarization (1997) 0.04
    0.039781027 = product of:
      0.13923359 = sum of:
        0.05205557 = weight(_text_:processing in 145) [ClassicSimilarity], result of:
          0.05205557 = score(doc=145,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 145, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=145)
        0.08717802 = weight(_text_:techniques in 145) [ClassicSimilarity], result of:
          0.08717802 = score(doc=145,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.48182213 = fieldWeight in 145, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=145)
      0.2857143 = coord(2/7)
    
    Abstract
Applies the ideas from the automatic link generation research to automatic text summarisation. Using techniques for inter-document link generation, generates intra-document links between passages of a document. Based on the intra-document linkage pattern of a text, characterises the structure of the text. Applies the knowledge of text structure to do automatic text summarisation by passage extraction. Evaluates a set of 50 summaries generated using these techniques by comparing them to paragraph extracts constructed by humans. The automatic summarisation methods perform well, especially in view of the fact that the summaries generated by 2 humans for the same article are surprisingly dissimilar.
    Source
    Information processing and management. 33(1997) no.2, S.193-207
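The intra-document linking step this abstract describes can be approximated with pairwise passage similarity. A sketch in that spirit, assuming tf-idf vectors and cosine similarity (the threshold value is an arbitrary assumption):

```python
# Link passages whose tf-idf cosine similarity clears a threshold
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def passage_links(passages, threshold=0.2):
    sim = cosine_similarity(TfidfVectorizer().fit_transform(passages))
    return [(i, j)
            for i in range(len(passages))
            for j in range(i + 1, len(passages))
            if sim[i, j] >= threshold]
```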
  4. Summarising software for publishing (1996) 0.04
    0.036267605 = product of:
      0.12693661 = sum of:
        0.056486145 = weight(_text_:digital in 5121) [ClassicSimilarity], result of:
          0.056486145 = score(doc=5121,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.34865242 = fieldWeight in 5121, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=5121)
        0.07045048 = weight(_text_:techniques in 5121) [ClassicSimilarity], result of:
          0.07045048 = score(doc=5121,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3893711 = fieldWeight in 5121, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0625 = fieldNorm(doc=5121)
      0.2857143 = coord(2/7)
    
    Abstract
Reviews 4 software packages designed to produce accurate, indicative summaries by creating distinctive abstracts from source documents. The products reviewed are: Oracle's ConText; InText's Object Analyzer; Iconovex's AnchorPage; and Software Scientific's Interrogator. Techniques used by the products include the use of dictionaries of known words and phrases to interpret documents, and heuristic analysis that weights all the words in a document solely on their occurrence and position within it.
    Source
    Digital publisher. 1(1996) no.4, S.15-20
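The heuristic analysis the review mentions, weighting words by occurrence and position, can be sketched generically (this illustrates the general technique, not the behaviour of any of the four products):

```python
# Weight words by frequency, favour early sentences, extract top scorers
import re
from collections import Counter

def extract_abstract(text, n_sentences=2):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    def score(i):
        tokens = re.findall(r"[a-z]+", sentences[i].lower())
        weight = sum(freq[t] for t in tokens) / max(len(tokens), 1)
        return weight * (1.5 if i == 0 else 1.0)  # position bonus
    top = sorted(range(len(sentences)), key=score, reverse=True)[:n_sentences]
    return " ".join(sentences[i] for i in sorted(top))
```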
  5. Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 944) [ClassicSimilarity], result of:
          0.05205557 = score(doc=944,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 944, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=944)
        0.06164417 = weight(_text_:techniques in 944) [ClassicSimilarity], result of:
          0.06164417 = score(doc=944,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 944, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=944)
      0.2857143 = coord(2/7)
    
    Abstract
This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization: a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.
    Source
    Information processing and management. 43(2007) no.6, S.1549-1570
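The multi-candidate idea can be shown crudely: generate several progressively shorter versions of a sentence and let a downstream selector choose among them. The toy below drops comma-delimited trailing material; it is only a stand-in for the paper's parse-and-trim and noisy-channel compressors:

```python
# Each candidate drops one more comma-delimited trailing chunk
def compression_candidates(sentence):
    parts = sentence.split(", ")
    return [", ".join(parts[:i]) for i in range(len(parts), 0, -1)]

print(compression_candidates(
    "The bill passed, despite objections, after a long debate"))
```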
  6. Moens, M.-F.: Summarizing court decisions (2007) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 954) [ClassicSimilarity], result of:
          0.05205557 = score(doc=954,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 954, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=954)
        0.06164417 = weight(_text_:techniques in 954) [ClassicSimilarity], result of:
          0.06164417 = score(doc=954,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 954, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=954)
      0.2857143 = coord(2/7)
    
    Abstract
In the field of law there is an absolute need to summarize the texts of court decisions in order to make the content of the cases easily accessible to legal professionals. During the SALOMON and MOSAIC projects we investigated the summarization and retrieval of legal cases. This article presents some of the main findings while integrating the research results of experiments on legal document summarization by other research groups. In addition, we propose novel avenues of research for automatic text summarization, which we currently exploit when summarizing court decisions in the ACILA project. Techniques for automated concept learning and argument recognition are the most challenging here.
    Source
    Information processing and management. 43(2007) no.6, S.1748-1764
  7. Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 945) [ClassicSimilarity], result of:
          0.04461906 = score(doc=945,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 945, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
        0.052837856 = weight(_text_:techniques in 945) [ClassicSimilarity], result of:
          0.052837856 = score(doc=945,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 945, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=945)
      0.2857143 = coord(2/7)
    
    Abstract
    The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. Along with that, we address the issue of data paucity by collecting data from RSS feeds available on the Internet, and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].
    Source
    Information processing and management. 43(2007) no.6, S.1571-1587
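Sentence compression can be framed, as in this paper, as tagging each token keep/drop with a CRF. A minimal sketch using the third-party sklearn-crfsuite package; the features and one-sentence training set are toy placeholders, not the paper's setup:

```python
# Compression as per-token keep/drop sequence labelling with a CRF
import sklearn_crfsuite  # pip install sklearn-crfsuite

def token_features(tokens, i):
    return {"word": tokens[i].lower(),
            "prev": tokens[i - 1].lower() if i else "<s>",
            "is_first": i == 0}

sentences = [["the", "very", "old", "cat", "slept"]]
tags = [["keep", "drop", "drop", "keep", "keep"]]  # toy gold labels

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))  # keep/drop decision per token
```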
  8. Shen, D.; Yang, Q.; Chen, Z.: Noise reduction through summarization for Web-page classification (2007) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 953) [ClassicSimilarity], result of:
          0.04461906 = score(doc=953,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 953, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=953)
        0.052837856 = weight(_text_:techniques in 953) [ClassicSimilarity], result of:
          0.052837856 = score(doc=953,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 953, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=953)
      0.2857143 = coord(2/7)
    
    Abstract
Due to a large variety of noisy information embedded in Web pages, Web-page classification is much more difficult than pure-text classification. In this paper, we propose to improve Web-page classification performance by removing the noise through summarization techniques. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then put forward a new Web-page summarization algorithm based on Web-page layout and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that the classification algorithms (NB or SVM) augmented by any summarization approach achieve an improvement of more than 5.0% over pure-text-based classification algorithms. We further introduce an ensemble method to combine the different summarization algorithms. The ensemble summarization method achieves more than 12.0% improvement over pure-text-based methods.
    Source
    Information processing and management. 43(2007) no.6, S.1735-1747
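The pipeline this abstract evaluates, classifying on a generated summary instead of the noisy full page, can be sketched with scikit-learn. The summarizer below is a placeholder (leading sentences); the paper uses layout-aware summarization algorithms:

```python
# Summarize-then-classify with TF-IDF features and Naive Bayes
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def summarize(page_text, n_sentences=3):
    return ". ".join(page_text.split(". ")[:n_sentences])

pages = ["Team wins the final. Crowd celebrates. Coach praises players.",
         "New telescope launched. It studies distant galaxies. Data flows."]
labels = ["sports", "science"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit([summarize(p) for p in pages], labels)
print(clf.predict([summarize("Striker scores twice. Team wins again.")]))
```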
  9. Díaz, A.; Gervás, P.: User-model based personalized summarization (2007) 0.02
    0.024852479 = product of:
      0.08698367 = sum of:
        0.04461906 = weight(_text_:processing in 952) [ClassicSimilarity], result of:
          0.04461906 = score(doc=952,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=952)
        0.042364612 = weight(_text_:digital in 952) [ClassicSimilarity], result of:
          0.042364612 = score(doc=952,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.26148933 = fieldWeight in 952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=952)
      0.2857143 = coord(2/7)
    
    Abstract
The potential of summary personalization is high: a summary that is useless for deciding the relevance of a document when produced generically may become useful if the sentences selected match the user's interests. In this paper we defend the use of a personalized summarization facility to maximize the density of relevance of selections sent by a personalized information system to a given user. The personalization is applied to the digital newspaper domain and uses a user model that stores long- and short-term interests via four reference systems: sections, categories, keywords and feedback terms. On the other hand, it is crucial to measure how much information is lost during the summarization process, and how this information loss may affect the ability of the user to judge the relevance of a given document. The results obtained in two personalization systems show that personalized summaries perform better than generic and generic-personalized summaries in terms of identifying documents that satisfy user preferences. We also conducted a user-centred direct evaluation, which showed a high level of user satisfaction with the summaries.
    Source
    Information processing and management. 43(2007) no.6, S.1715-1734
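The selection idea above reduces, in its simplest form, to preferring sentences that overlap the user model's keyword set. A toy version (the keyword set is a hypothetical example, not the paper's user model):

```python
# Rank sentences by overlap with user-model keywords
def personalized_summary(sentences, user_keywords, n=2):
    def overlap(s):
        return len(set(s.lower().split()) & user_keywords)
    top = set(sorted(sentences, key=overlap, reverse=True)[:n])
    return [s for s in sentences if s in top]  # preserve original order

print(personalized_summary(
    ["Markets fell sharply today.", "The match ended 2-1.",
     "Interest rates may rise."],
    {"markets", "rates"}))
```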
  10. Galgani, F.; Compton, P.; Hoffmann, A.: Summarization based on bi-directional citation analysis (2015) 0.02
    0.02320403 = product of:
      0.0812141 = sum of:
        0.03718255 = weight(_text_:processing in 2685) [ClassicSimilarity], result of:
          0.03718255 = score(doc=2685,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 2685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2685)
        0.044031553 = weight(_text_:techniques in 2685) [ClassicSimilarity], result of:
          0.044031553 = score(doc=2685,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.24335694 = fieldWeight in 2685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2685)
      0.2857143 = coord(2/7)
    
    Abstract
    Automatic document summarization using citations is based on summarizing what others explicitly say about the document, by extracting a summary from text around the citations (citances). While this technique works quite well for summarizing the impact of scientific articles, other genres of documents as well as other types of summaries require different approaches. In this paper, we introduce a new family of methods that we developed for legal documents summarization to generate catchphrases for legal cases (where catchphrases are a form of legal summary). Our methods use both incoming and outgoing citations, and we show how citances can be combined with other elements of cited and citing documents, including the full text of the target document, and catchphrases of cited and citing cases. On a legal summarization corpus, our methods outperform competitive baselines. The combination of full text sentences and catchphrases from cited and citing cases is particularly successful. We also apply and evaluate the methods on scientific paper summarization, where they perform at the level of state-of-the-art techniques. Our family of citation-based summarization methods is powerful and flexible enough to target successfully a range of different domains and summarization tasks.
    Source
    Information processing and management. 51(2015) no.1, S.1-24
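The citance-gathering step the abstract builds on can be sketched as collecting the sentences of a citing document that mention the target's identifier (the identifier format here is hypothetical):

```python
# Keep sentences of a citing document that mention the target identifier
import re

def citances(citing_text, target_id):
    sentences = re.split(r"(?<=[.!?])\s+", citing_text)
    return [s for s in sentences if target_id in s]

print(citances("Smith v Jones [2001] set the test. It was later narrowed.",
               "[2001]"))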
  11. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.02
    0.0175181 = product of:
      0.061313346 = sum of:
        0.04461906 = weight(_text_:processing in 948) [ClassicSimilarity], result of:
          0.04461906 = score(doc=948,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
        0.016694285 = product of:
          0.03338857 = sum of:
            0.03338857 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
              0.03338857 = score(doc=948,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.23214069 = fieldWeight in 948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=948)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
    Source
    Information processing and management. 43(2007) no.6, S.1606-1618
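The generic extractive baseline named in the title, SumBasic, is compact enough to sketch: repeatedly pick the sentence with the highest average word probability, then square the chosen words' probabilities to discourage redundancy. This is my reconstruction of the published algorithm, not the authors' code:

```python
# SumBasic-style extractive summarization
from collections import Counter

def sumbasic(sentences, n=2):
    words = [s.lower().split() for s in sentences]
    total = sum(len(w) for w in words)
    p = {w: c / total
         for w, c in Counter(t for ws in words for t in ws).items()}
    chosen = []
    while len(chosen) < min(n, len(sentences)):
        rest = [i for i in range(len(sentences)) if i not in chosen]
        best = max(rest, key=lambda i:
                   sum(p[t] for t in words[i]) / max(len(words[i]), 1))
        chosen.append(best)
        for t in words[best]:
            p[t] **= 2  # penalize words already covered
    return [sentences[i] for i in sorted(chosen)]
```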
  12. Craven, T.C.: Presentation of repeated phrases in a computer-assisted abstracting tool kit (2001) 0.01
    0.014873021 = product of:
      0.10411114 = sum of:
        0.10411114 = weight(_text_:processing in 3667) [ClassicSimilarity], result of:
          0.10411114 = score(doc=3667,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.6261658 = fieldWeight in 3667, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=3667)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 37(2001) no.2, S.221-230
  13. Bateman, J.; Teich, E.: Selective information presentation in an integrated publication system : an application of genre-driven text generation (1995) 0.01
    0.014873021 = product of:
      0.10411114 = sum of:
        0.10411114 = weight(_text_:processing in 2928) [ClassicSimilarity], result of:
          0.10411114 = score(doc=2928,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.6261658 = fieldWeight in 2928, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=2928)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 31(1995) no.5, S.753-767
  14. Endres-Niggemeyer, B.: SimSum : an empirically founded simulation of summarizing (2000) 0.01
    0.014873021 = product of:
      0.10411114 = sum of:
        0.10411114 = weight(_text_:processing in 3343) [ClassicSimilarity], result of:
          0.10411114 = score(doc=3343,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.6261658 = fieldWeight in 3343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=3343)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 36(2000) no.4, S.659-682
  15. Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01
    0.014598417 = product of:
      0.051094458 = sum of:
        0.03718255 = weight(_text_:processing in 1012) [ClassicSimilarity], result of:
          0.03718255 = score(doc=1012,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
        0.013911906 = product of:
          0.027823811 = sum of:
            0.027823811 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
              0.027823811 = score(doc=1012,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.19345059 = fieldWeight in 1012, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1012)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases are contributing increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp the paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the Macro-avgs of , , and on the Paper with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
    Date
    22. 6.2023 14:55:20
  16. Moens, M.-F.; Uyttendaele, C.; Dumotier, J.: Abstracting of legal cases : the potential of clustering based on the selection of representative objects (1999) 0.01
    0.010674859 = product of:
      0.07472401 = sum of:
        0.07472401 = weight(_text_:techniques in 2944) [ClassicSimilarity], result of:
          0.07472401 = score(doc=2944,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.4129904 = fieldWeight in 2944, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=2944)
      0.14285715 = coord(1/7)
    
    Abstract
The SALOMON project automatically summarizes Belgian criminal cases in order to improve access to the large number of existing and future court decisions. SALOMON extracts text units from the case text to form a case summary. Such a case summary facilitates the rapid determination of the relevance of the case or may be employed in text search. An important part of the research concerns the development of techniques for automatic recognition of representative text paragraphs (or sentences) in texts of unrestricted domains. These techniques are employed to eliminate redundant material in the case texts, and to identify informative text paragraphs which are relevant to include in the case summary. An evaluation of a test set of 700 criminal cases demonstrates that the algorithms have an application potential for automatic indexing, abstracting, and text linkage.
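The representative-object idea above can be approximated by clustering text units and keeping the unit nearest each cluster centre. A sketch with scikit-learn (parameters are arbitrary; SALOMON's actual procedure differs in detail):

```python
# Cluster paragraphs and keep the one nearest each cluster centre
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def representative_units(paragraphs, k=3):
    X = TfidfVectorizer().fit_transform(paragraphs)  # needs >= k paragraphs
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    nearest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    return [paragraphs[i] for i in sorted(set(nearest))]
```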
  17. Johnson, F.C.; Paice, C.D.; Black, W.J.; Neal, A.P.: ¬The application of linguistic processing to automatic abstract generation (1993) 0.01
    0.010623586 = product of:
      0.0743651 = sum of:
        0.0743651 = weight(_text_:processing in 2290) [ClassicSimilarity], result of:
          0.0743651 = score(doc=2290,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.4472613 = fieldWeight in 2290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.078125 = fieldNorm(doc=2290)
      0.14285715 = coord(1/7)
    
  18. McKeown, K.; Robin, J.; Kukich, K.: Generating concise natural language summaries (1995) 0.01
    0.010623586 = product of:
      0.0743651 = sum of:
        0.0743651 = weight(_text_:processing in 2932) [ClassicSimilarity], result of:
          0.0743651 = score(doc=2932,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.4472613 = fieldWeight in 2932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.078125 = fieldNorm(doc=2932)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 31(1995) no.5, S.703-733
  19. Endres-Niggemeyer, B.; Maier, E.; Sigel, A.: How to implement a naturalistic model of abstracting : four core working steps of an expert abstractor (1995) 0.01
    0.010516814 = product of:
      0.0736177 = sum of:
        0.0736177 = weight(_text_:processing in 2930) [ClassicSimilarity], result of:
          0.0736177 = score(doc=2930,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.4427661 = fieldWeight in 2930, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
      0.14285715 = coord(1/7)
    
    Abstract
    4 working steps taken from a comprehensive empirical model of expert abstracting are studied in order to prepare an explorative implementation of a simulation model. It aims at explaining the knowledge processing activities during professional summarizing. Following the case-based and holistic strategy of qualitative empirical research, the main features of the simulation system were developed by investigating in detail a small but central test case - 4 working steps where an expert abstractor discovers what the paper is about and drafts the topic sentence of the abstract
    Source
    Information processing and management. 31(1995) no.5, S.631-674
  20. Dorr, B.J.; Gaasterland, T.: Exploiting aspectual features and connecting words for summarization-inspired temporal-relation extraction (2007) 0.01
    0.009014412 = product of:
      0.06310088 = sum of:
        0.06310088 = weight(_text_:processing in 950) [ClassicSimilarity], result of:
          0.06310088 = score(doc=950,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3795138 = fieldWeight in 950, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=950)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper presents a model that incorporates contemporary theories of tense and aspect and develops a new framework for extracting temporal relations between two sentence-internal events, given their tense, aspect, and a temporal connecting word relating the two events. A linguistic constraint on event combination has been implemented to detect incorrect parser analyses and potentially apply syntactic reanalysis or semantic reinterpretation - in preparation for subsequent processing for multi-document summarization. An important contribution of this work is the extension of two different existing theoretical frameworks - Hornstein's 1990 theory of tense analysis and Allen's 1984 theory on event ordering - and the combination of both into a unified system for representing and constraining combinations of different event types (points, closed intervals, and open-ended intervals). We show that our theoretical results have been verified in a large-scale corpus analysis. The framework is designed to inform a temporally motivated sentence-ordering module in an implemented multi-document summarization system.
    Source
    Information processing and management. 43(2007) no.6, S.1681-1704
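The event-ordering layer this paper extends goes back to Allen's interval relations. A simplified sketch over closed numeric intervals; the paper's framework also covers points and open-ended intervals, which this toy omits:

```python
# Simplified Allen-style relation between two closed intervals
def allen_relation(a, b):
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:                return "before"
    if a2 == b1:               return "meets"
    if (a1, a2) == (b1, b2):   return "equal"
    if a1 >= b1 and a2 <= b2:  return "during"    # incl. starts/finishes
    if a1 <= b1 and a2 >= b2:  return "contains"
    if a1 < b1 < a2:           return "overlaps"
    return allen_relation(b, a) + "-inverse"      # remaining symmetric cases

print(allen_relation((1, 3), (2, 5)))  # -> overlaps
```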

Types

  • a 61
  • el 1
  • s 1