Search (23 results, page 1 of 2)

Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01

0.0053709024 = product of:
  0.02148361 = sum of:
    0.014460463 = product of:
      0.04338139 = sum of:
        0.04338139 = weight(_text_:problem in 5400) [ClassicSimilarity], result of:
          0.04338139 = score(doc=5400,freq=4.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.33160037 = fieldWeight in 5400, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.33333334 = coord(1/3)
    0.007023146 = product of:
      0.021069437 = sum of:
        0.021069437 = weight(_text_:29 in 5400) [ClassicSimilarity], result of:
          0.021069437 = score(doc=5400,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.19432661 = fieldWeight in 5400, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.33333334 = coord(1/3)
  0.25 = coord(2/8)

Abstract: Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
Date: 29. 9.2019 12:18:42
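
The score tree for this first result can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names `idf`, `tf`, and `term_score` are hypothetical, chosen only for illustration. All input numbers are copied from the tree for doc 5400.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency (assumed standard formula)
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    # ClassicSimilarity term frequency: square root of the raw count
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 5400
MAX_DOCS = 44218
QUERY_NORM = 0.030822188
FIELD_NORM = 0.0390625

problem_weight = term_score(4.0, 1723, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_29_weight = term_score(2.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Each term sits in its own sub-query that matched 1 of 3 clauses,
# and 2 of the 8 top-level clauses matched overall.
total = (problem_weight * (1 / 3) + term_29_weight * (1 / 3)) * (2 / 8)

print(problem_weight, term_29_weight, total)
```

Small differences in the last digits are expected, because Lucene computes these values in single precision while Python uses double precision.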

Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.00

0.0034957787 = product of:
  0.02796623 = sum of:
    0.02796623 = product of:
      0.041949343 = sum of:
        0.021069437 = weight(_text_:29 in 2640) [ClassicSimilarity], result of:
          0.021069437 = score(doc=2640,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.19432661 = fieldWeight in 2640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2640)
        0.020879906 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
          0.020879906 = score(doc=2640,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.19345059 = fieldWeight in 2640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2640)
      0.6666667 = coord(2/3)
  0.125 = coord(1/8)

Date: 22. 1.2016 12:29:41

Pinto, M.: Engineering the production of meta-information : the abstracting concern (2003) 0.00

0.0034762803 = product of:
  0.027810242 = sum of:
    0.027810242 = product of:
      0.08343072 = sum of:
        0.08343072 = weight(_text_:29 in 4667) [ClassicSimilarity], result of:
          0.08343072 = score(doc=4667,freq=4.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.7694941 = fieldWeight in 4667, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.109375 = fieldNorm(doc=4667)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 27.11.2005 18:29:55
Source: Journal of information science. 29(2003) no.5, S.405-418

Ercan, G.; Cicekli, I.: Using lexical chains for keyword extraction (2007) 0.00
```
0.0025305813 = product of:
  0.02024465 = sum of:
    0.02024465 = product of:
      0.06073395 = sum of:
        0.06073395 = weight(_text_:problem in 951) [ClassicSimilarity], result of:
          0.06073395 = score(doc=951,freq=4.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.46424055 = fieldWeight in 951, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0546875 = fieldNorm(doc=951)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

Keywords can be considered as condensed versions of documents and short forms of their summaries. In this paper, the problem of automatic extraction of keywords from documents is treated as a supervised learning task. A lexical chain holds a set of semantically related words of a text, and it can be said that a lexical chain represents the semantic content of a portion of the text. Although lexical chains have been extensively used in text summarization, their usage for the keyword extraction problem has not been fully investigated. In this paper, a keyword extraction technique that uses lexical chains is described, and encouraging results are obtained.

Hirao, T.; Okumura, M.; Yasuda, N.; Isozaki, H.: Supervised automatic evaluation for summarization with voted regression model (2007) 0.00
```
0.0021690698 = product of:
  0.017352559 = sum of:
    0.017352559 = product of:
      0.052057672 = sum of:
        0.052057672 = weight(_text_:problem in 942) [ClassicSimilarity], result of:
          0.052057672 = score(doc=942,freq=4.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.39792046 = fieldWeight in 942, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=942)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

The high quality evaluation of generated summaries is needed if we are to improve automatic summarization systems. Although human evaluation provides better results than automatic evaluation methods, its cost is huge and it is difficult to reproduce the results. Therefore, we need an automatic method that simulates human evaluation if we are to improve our summarization system efficiently. Although automatic evaluation methods have been proposed, they are unreliable when used for individual summaries. To solve this problem, we propose a supervised automatic evaluation method based on a new regression model called the voted regression model (VRM). VRM has two characteristics: (1) model selection based on 'corrected AIC' to avoid multicollinearity, (2) voting by the selected models to alleviate the problem of overfitting. Evaluation results obtained for TSC3 and DUC2004 show that our method achieved error reductions of about 17-51% compared with conventional automatic evaluation methods. Moreover, our method obtained the highest correlation coefficients in several different experiments.

Ruda, S.: Abstracting: eine Auswahlbibliographie (1992) 0.00
```
0.001789391 = product of:
  0.014315128 = sum of:
    0.014315128 = product of:
      0.042945385 = sum of:
        0.042945385 = weight(_text_:problem in 6603) [ClassicSimilarity], result of:
          0.042945385 = score(doc=6603,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.3282676 = fieldWeight in 6603, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6603)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

This selective bibliography is divided into 9 subject areas. The first section contains literature that deals with abstracts and abstracting methods in general and gives an overview of the state of research. The next section reviews papers that describe the historical development of abstracting. The third part lists the abstracting guidelines of various institutions. Lexical, syntactic, and semantic text condensation methods are the subject of the works presented in section 4. Text structures of abstracts are considered under point 5, and the works in the next subject area deal with the problem of writing abstracts. The seventh section lists so-called 'automatic' and machine-assisted abstracting methods. Next, 'automatic' and machine-assisted abstracting methods, abstracts in comparison with their primary texts, and abstracts in general are evaluated. The final section consists of bibliographies.

Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.00
```
0.001789391 = product of:
  0.014315128 = sum of:
    0.014315128 = product of:
      0.042945385 = sum of:
        0.042945385 = weight(_text_:problem in 944) [ClassicSimilarity], result of:
          0.042945385 = score(doc=944,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.3282676 = fieldWeight in 944, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0546875 = fieldNorm(doc=944)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization: a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.

Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.00

0.0017557865 = product of:
  0.014046292 = sum of:
    0.014046292 = product of:
      0.042138875 = sum of:
        0.042138875 = weight(_text_:29 in 1949) [ClassicSimilarity], result of:
          0.042138875 = score(doc=1949,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.38865322 = fieldWeight in 1949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 16. 8.1998 12:30:29

Ye, S.; Chua, T.-S.; Kan, M.-Y.; Qiu, L.: Document concept lattice for text understanding and summarization (2007) 0.00
```
0.0015337638 = product of:
  0.012270111 = sum of:
    0.012270111 = product of:
      0.03681033 = sum of:
        0.03681033 = weight(_text_:problem in 941) [ClassicSimilarity], result of:
          0.03681033 = score(doc=941,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.28137225 = fieldWeight in 941, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=941)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

We argue that the quality of a summary can be evaluated based on how many concepts in the original document(s) can be preserved after summarization. Here, a concept refers to an abstract or concrete entity or its action, often expressed by diverse terms in text. Summary generation can thus be considered as an optimization problem of selecting a set of sentences with minimal answer loss. In this paper, we propose a document concept lattice that indexes the hierarchy of local topics tied to a set of frequent concepts and the corresponding sentences containing these topics. The local topics will specify the promising sub-spaces related to the selected concepts and sentences. Based on this lattice, the summary is an optimized selection of a set of distinct and salient local topics that lead to maximal coverage of concepts with the given number of sentences. Our summarizer based on the concept lattice has demonstrated competitive performance in Document Understanding Conference 2005 and 2006 evaluations as well as follow-on tests.

Craven, T.C.: A phrase flipper for the assistance of writers of abstracts and other text (1995) 0.00

0.0014046291 = product of:
  0.011237033 = sum of:
    0.011237033 = product of:
      0.033711098 = sum of:
        0.033711098 = weight(_text_:29 in 4897) [ClassicSimilarity], result of:
          0.033711098 = score(doc=4897,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.31092256 = fieldWeight in 4897, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=4897)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 17. 8.1996 10:29:59

Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.00

0.0013919937 = product of:
  0.01113595 = sum of:
    0.01113595 = product of:
      0.03340785 = sum of:
        0.03340785 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
          0.03340785 = score(doc=6599,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.30952093 = fieldWeight in 6599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6599)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 26. 2.1997 10:22:43

Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.00

0.0013919937 = product of:
  0.01113595 = sum of:
    0.01113595 = product of:
      0.03340785 = sum of:
        0.03340785 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
          0.03340785 = score(doc=6751,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.30952093 = fieldWeight in 6751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6751)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 6. 3.1997 16:22:15

Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.00

0.0013919937 = product of:
  0.01113595 = sum of:
    0.01113595 = product of:
      0.03340785 = sum of:
        0.03340785 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
          0.03340785 = score(doc=6974,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.30952093 = fieldWeight in 6974, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6974)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon

Ouyang, Y.; Li, W.; Li, S.; Lu, Q.: Intertopic information mining for query-based summarization (2010) 0.00
```
0.0012781365 = product of:
  0.010225092 = sum of:
    0.010225092 = product of:
      0.030675275 = sum of:
        0.030675275 = weight(_text_:problem in 3459) [ClassicSimilarity], result of:
          0.030675275 = score(doc=3459,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.23447686 = fieldWeight in 3459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3459)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

In this article, the authors address the problem of sentence ranking in summarization. Although most existing summarization approaches are concerned with the information embodied in a particular topic (including a set of documents and an associated query) for sentence ranking, they propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, is able to reveal pairwise topic relationships and thus can be considered as the bridge across different topics. In this article, the intertopic information is used for transferring word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model the topic relationship by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles of the words in the topic. Based on the mined relationships, the authors develop a probabilistic model using manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments has been conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is indeed effective for sentence ranking and the resultant summarization system performs comparably well to the best-performing DUC participating systems on the same data set.

Cai, X.; Li, W.: Enhancing sentence-level clustering with integrated and interactive frameworks for theme-based summarization (2011) 0.00
```
0.0012781365 = product of:
  0.010225092 = sum of:
    0.010225092 = product of:
      0.030675275 = sum of:
        0.030675275 = weight(_text_:problem in 4770) [ClassicSimilarity], result of:
          0.030675275 = score(doc=4770,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.23447686 = fieldWeight in 4770, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4770)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

Sentence clustering plays a pivotal role in theme-based summarization, which discovers topic themes defined as the clusters of highly related sentences to avoid redundancy and cover more diverse information. Because sentences are short and their content is limited, the bag-of-words cosine similarity traditionally used for document clustering is no longer suitable. Special treatment for measuring sentence similarity is necessary. In this article, we study the sentence-level clustering problem. After exploiting concept- and context-enriched sentence vector representations, we develop two co-clustering frameworks to enhance sentence-level clustering for theme-based summarization (integrated clustering and interactive clustering), both allowing word and document to play an explicit role in sentence clustering as independent text objects rather than using word or concept as features of a sentence in a document set. In each framework, we experiment with two-level co-clustering (i.e., sentence-word co-clustering or sentence-document co-clustering) and three-level co-clustering (i.e., document-sentence-word co-clustering). Compared against concept- and context-oriented sentence-representation reformation, co-clustering shows a clear advantage in both intrinsic clustering quality evaluation and extrinsic summarization evaluation conducted on the Document Understanding Conferences (DUC) datasets.

Uyttendaele, C.; Moens, M.-F.; Dumortier, J.: SALOMON: automatic abstracting of legal cases for effective access to court decisions (1998) 0.00

0.0012290506 = product of:
  0.009832405 = sum of:
    0.009832405 = product of:
      0.029497212 = sum of:
        0.029497212 = weight(_text_:29 in 495) [ClassicSimilarity], result of:
          0.029497212 = score(doc=495,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.27205724 = fieldWeight in 495, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=495)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 17. 7.1996 14:16:29

Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.00
```
0.0010439953 = product of:
  0.008351962 = sum of:
    0.008351962 = product of:
      0.025055885 = sum of:
        0.025055885 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
          0.025055885 = score(doc=948,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.23214069 = fieldWeight in 948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.

Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.00
```
0.0010225092 = product of:
  0.008180073 = sum of:
    0.008180073 = product of:
      0.02454022 = sum of:
        0.02454022 = weight(_text_:problem in 5089) [ClassicSimilarity], result of:
          0.02454022 = score(doc=5089,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.1875815 = fieldWeight in 5089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.03125 = fieldNorm(doc=5089)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method for query-based multi-document summarization of opinions expressed in reviews. QMOS combines multiple sentiment dictionaries to overcome the limited word coverage of an individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word presented by a lexicon and the word polarity in a specific context. This is because the polarity of a word depends on the context in which it is used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. Therefore, to tackle the aforementioned challenges, QMOS integrates multiple strategies to adjust the prior sentiment orientation of a word while also considering the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon. Moreover, most existing methods fail to distinguish the meaning of a review sentence from that of the user's query when both share a similar bag-of-words; hence there is often a conflict between the extracted opinionated sentences and users' needs. The summarization phase of QMOS, however, is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs a greedy algorithm and a query expansion approach to reduce redundancy and to bridge the lexical gaps between similar contexts expressed in different wording, respectively. Our experiment shows that the QMOS method can significantly improve performance and is comparable to other existing methods.

Sweeney, S.; Crestani, F.; Losada, D.E.: 'Show me more' : incremental length summarisation using novelty detection (2008) 0.00

8.7789324E-4 = product of:
  0.007023146 = sum of:
    0.007023146 = product of:
      0.021069437 = sum of:
        0.021069437 = weight(_text_:29 in 2054) [ClassicSimilarity], result of:
          0.021069437 = score(doc=2054,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.19432661 = fieldWeight in 2054, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2054)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 29. 7.2008 19:35:12

Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.00

8.699961E-4 = product of:
  0.0069599687 = sum of:
    0.0069599687 = product of:
      0.020879906 = sum of:
        0.020879906 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
          0.020879906 = score(doc=5290,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.19345059 = fieldWeight in 5290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 22. 7.2006 17:25:48
