Search (7 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Automatisches Abstracting"
  • year_i:[2010 TO 2020}
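  These facets are Solr filter queries; year_i:[2010 TO 2020} uses Solr's mixed-bracket range syntax (2010 inclusive, 2020 exclusive), so the unmatched brackets are deliberate. A minimal sketch of reproducing this search over HTTP, assuming a local Solr core (the endpoint URL and the title field name are placeholders; only the three facet fields above come from this page):

    import requests

    # Hypothetical endpoint; host, port, and core name are placeholders.
    SOLR_URL = "http://localhost:8983/solr/literature/select"

    params = {
        "q": "*:*",
        # Filter queries mirroring the active facets above; [2010 TO 2020}
        # is a half-open range: lower bound inclusive, upper bound exclusive.
        "fq": [
            'language_ss:"e"',
            'theme_ss:"Automatisches Abstracting"',
            "year_i:[2010 TO 2020}",
        ],
        "wt": "json",
        "rows": 10,
    }

    docs = requests.get(SOLR_URL, params=params).json()["response"]["docs"]
    for doc in docs:
        print(doc.get("year_i"), doc.get("title_txt"))  # title field is a guess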
  1. Ouyang, Y.; Li, W.; Li, S.; Lu, Q.: Intertopic information mining for query-based summarization (2010) 0.01
    0.0061221616 = product of:
      0.030610807 = sum of:
        0.010176711 = product of:
          0.03053013 = sum of:
            0.03053013 = weight(_text_:problem in 3459) [ClassicSimilarity], result of:
              0.03053013 = score(doc=3459,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.23447686 = fieldWeight in 3459, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3459)
          0.33333334 = coord(1/3)
        0.020434096 = product of:
          0.06130229 = sum of:
            0.06130229 = weight(_text_:2010 in 3459) [ClassicSimilarity], result of:
              0.06130229 = score(doc=3459,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.41779095 = fieldWeight in 3459, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3459)
          0.33333334 = coord(1/3)
      0.2 = coord(2/10)
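    The breakdown above is Lucene ClassicSimilarity "explain" output: each leaf score is queryWeight (idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), and coord() scales each sum by the fraction of query clauses that matched. A minimal sketch re-deriving this document's score from the values shown, assuming ClassicSimilarity's idf = 1 + ln(maxDocs / (docFreq + 1)):

      import math

      # Values copied from the explain tree for doc 3459 above.
      max_docs, doc_freq = 44218, 1723
      freq = 2.0
      query_norm = 0.03067635              # normalizes weights across the query
      field_norm = 0.0390625               # encoded length norm of the field

      idf = 1 + math.log(max_docs / (doc_freq + 1))   # ~4.244485
      tf = math.sqrt(freq)                            # ~1.4142135
      query_weight = idf * query_norm                 # ~0.1302053
      field_weight = tf * idf * field_norm            # ~0.23447686
      leaf = query_weight * field_weight              # ~0.03053013 for _text_:problem

      # coord() rewards matching clauses: 1 of 3 inner clauses matched,
      # and 2 of 10 outer clauses matched for this document.
      clause_problem = leaf * (1 / 3)                 # ~0.010176711
      clause_2010 = 0.06130229 * (1 / 3)              # ~0.020434096, second subtree
      print((clause_problem + clause_2010) * (2 / 10))  # ~0.0061221616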
    
    Abstract
    In this article, the authors address the problem of sentence ranking in summarization. Whereas most existing summarization approaches rank sentences using only the information embodied in a particular topic (a set of documents and an associated query), the authors propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, reveals pairwise topic relationships and can thus be considered a bridge across different topics. Here, intertopic information is used to transfer word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model topic relationships by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles the words play in a topic. Based on the mined relationships, they develop a probabilistic model that uses manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments was conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is indeed effective for sentence ranking and that the resulting summarization system performs comparably to the best-performing DUC participating systems on the same data set.
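    The abstract does not spell out the probabilistic model, but the general learning-based setup it describes (estimate word importance from topics that have reference summaries, then score sentences in new topics by their words) might be sketched as follows; the counting model and all names here are illustrative stand-ins, not the authors' method:

      from collections import defaultdict

      def learn_word_importance(known_topics):
          """known_topics: list of (sentences, reference_summary) pairs.
          A crude stand-in for the paper's model: a word's importance is
          the fraction of its occurrences that land in reference summaries."""
          hits = defaultdict(float)
          counts = defaultdict(int)
          for sentences, summary in known_topics:
              summary_words = set(summary.lower().split())
              for sentence in sentences:
                  for word in sentence.lower().split():
                      counts[word] += 1
                      if word in summary_words:
                          hits[word] += 1.0
          return {w: hits[w] / counts[w] for w in counts}

      def rank_sentences(sentences, importance):
          # Score each sentence by the mean importance of its words.
          def score(s):
              words = s.lower().split()
              return sum(importance.get(w, 0.0) for w in words) / max(len(words), 1)
          return sorted(sentences, key=score, reverse=True)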
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.5, S.1062-1072
    Year
    2010
  2. Wang, W.; Hwang, D.: Abstraction Assistant : an automatic text abstraction system (2010) 0.00
    0.0024520915 = product of:
      0.024520915 = sum of:
        0.024520915 = product of:
          0.07356274 = sum of:
            0.07356274 = weight(_text_:2010 in 3981) [ClassicSimilarity], result of:
              0.07356274 = score(doc=3981,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5013491 = fieldWeight in 3981, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3981)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.9, S.1790-1799
    Year
    2010
  3. Martinez-Romo, J.; Araujo, L.; Fernandez, A.D.: SemGraph : extracting keyphrases following a novel semantic graph-based approach (2016) 0.00
    0.0015508389 = product of:
      0.015508389 = sum of:
        0.015508389 = product of:
          0.046525165 = sum of:
            0.046525165 = weight(_text_:2010 in 2832) [ClassicSimilarity], result of:
              0.046525165 = score(doc=2832,freq=2.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.31708103 = fieldWeight in 2832, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2832)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Keyphrases represent the main topics a text is about. In this article, we introduce SemGraph, an unsupervised algorithm for extracting keyphrases from a collection of texts based on a semantic relationship graph. The main novelty of this algorithm is its ability to identify semantic relationships between words whose presence is statistically significant. Our method constructs a co-occurrence graph in which words appearing in the same document are linked, provided their presence in the collection is statistically significant with respect to a null model. Furthermore, the graph obtained is enriched with information from WordNet. We used the most recent standardized benchmark to evaluate the system's ability to detect the keyphrases that are part of the text. The result is a method that achieves improvements of 5.3% and 7.28% in F measure over the two labeled sets of keyphrases used in the evaluation of SemEval-2010.
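    As an illustration of the construction the abstract describes (link words whose co-occurrence is statistically significant under a null model, then rank nodes on the resulting graph), here is a sketch using networkx; the PMI threshold is a simple stand-in for the paper's significance test, and the WordNet enrichment step is omitted:

      import itertools, math
      from collections import Counter
      import networkx as nx

      def build_cooccurrence_graph(documents, min_pmi=1.0):
          # Link word pairs whose document co-occurrence exceeds what
          # independent occurrence would predict (a PMI filter as a
          # stand-in for the paper's null-model test).
          doc_sets = [set(d.lower().split()) for d in documents]
          n = len(doc_sets)
          df = Counter(w for s in doc_sets for w in s)
          pair_df = Counter(p for s in doc_sets
                            for p in itertools.combinations(sorted(s), 2))
          graph = nx.Graph()
          for (w1, w2), joint in pair_df.items():
              pmi = math.log(joint * n / (df[w1] * df[w2]))
              if pmi > min_pmi:
                  graph.add_edge(w1, w2, weight=pmi)
          return graph

      def keyphrase_candidates(graph, top_k=10):
          # Rank words by weighted PageRank, as in graph-based extraction.
          ranks = nx.pagerank(graph, weight="weight")
          return sorted(ranks, key=ranks.get, reverse=True)[:top_k]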
  4. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.00
    0.0014392043 = product of:
      0.0143920425 = sum of:
        0.0143920425 = product of:
          0.043176126 = sum of:
            0.043176126 = weight(_text_:problem in 5400) [ClassicSimilarity], result of:
              0.043176126 = score(doc=5400,freq=4.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.33160037 = fieldWeight in 5400, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5400)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) that are most relevant to a query, which becomes more difficult as the amount of data grows. Data sparsity and model scalability are the major challenges in solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address the problem in two steps: first, we embed different types of entities into the same semantic space, where similarity can be computed easily; second, we propose a novel non-parametric method that identifies the most relevant entities beyond direct semantic similarity. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are therefore more problematic for a classifier.
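    The abstract's two-step recipe (embed everything into one semantic space, then predict by proximity) can be sketched as a simple nearest-neighbour predictor; this is a generic stand-in, not the authors' non-parametric method, and it assumes unit-normalised embeddings already exist:

      import numpy as np

      def predict_subjects(query_vec, record_vecs, record_subjects, k=10):
          """Nearest-neighbour subject prediction in a shared embedding space.
          record_vecs: (n, d) unit-normalised record embeddings;
          record_subjects: list of subject-label sets, one per record."""
          sims = record_vecs @ query_vec      # cosine similarity for unit vectors
          top = np.argsort(-sims)[:k]
          scores = {}
          for i in top:
              for subject in record_subjects[i]:
                  scores[subject] = scores.get(subject, 0.0) + float(sims[i])
          return sorted(scores, key=scores.get, reverse=True)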
  5. Cai, X.; Li, W.: Enhancing sentence-level clustering with integrated and interactive frameworks for theme-based summarization (2011) 0.00
    0.0010176711 = product of:
      0.010176711 = sum of:
        0.010176711 = product of:
          0.03053013 = sum of:
            0.03053013 = weight(_text_:problem in 4770) [ClassicSimilarity], result of:
              0.03053013 = score(doc=4770,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.23447686 = fieldWeight in 4770, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4770)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Sentence clustering plays a pivotal role in theme-based summarization, which discovers topic themes, defined as clusters of highly related sentences, to avoid redundancy and cover more diverse information. Because sentences are short and carry limited content, the bag-of-words cosine similarity traditionally used for document clustering is no longer suitable; measuring sentence similarity requires special treatment. In this article, we study the sentence-level clustering problem. After exploiting concept- and context-enriched sentence vector representations, we develop two co-clustering frameworks to enhance sentence-level clustering for theme-based summarization: integrated clustering and interactive clustering. Both allow words and documents to play an explicit role in sentence clustering as independent text objects, rather than treating words or concepts as features of a sentence in a document set. In each framework, we experiment with two-level co-clustering (i.e., sentence-word or sentence-document co-clustering) and three-level co-clustering (i.e., document-sentence-word co-clustering). Compared against concept- and context-oriented sentence-representation reformation, co-clustering shows a clear advantage in both intrinsic clustering-quality evaluation and extrinsic summarization evaluation conducted on the Document Understanding Conferences (DUC) datasets.
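    As a rough illustration of two-level sentence-word co-clustering (clustering sentences and words jointly from the sentence-word matrix, rather than treating words as sentence features), here is a generic sketch with scikit-learn's SpectralCoclustering; it is a stand-in, not the authors' integrated or interactive framework:

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.cluster import SpectralCoclustering

      def cocluster_sentences(sentences, n_themes=5):
          # Rows are sentences, columns are words; sentences and words
          # are clustered jointly from this matrix.
          matrix = CountVectorizer().fit_transform(sentences)
          model = SpectralCoclustering(n_clusters=n_themes, random_state=0)
          model.fit(matrix)
          themes = {}
          for idx, label in enumerate(model.row_labels_):
              themes.setdefault(int(label), []).append(sentences[idx])
          return themes   # theme id -> sentences forming that theme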
  6. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.00
    8.141369E-4 = product of:
      0.008141369 = sum of:
        0.008141369 = product of:
          0.024424106 = sum of:
            0.024424106 = weight(_text_:problem in 5089) [ClassicSimilarity], result of:
              0.024424106 = score(doc=5089,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.1875815 = fieldWeight in 5089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5089)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Sentiment analysis concerns the study of opinions expressed in a text. This paper presents QMOS, a lexicon-based method for query-based multi-document summarization of opinions expressed in reviews, combining sentiment analysis and summarization approaches. QMOS merges multiple sentiment dictionaries to overcome the limited word coverage of any individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity a lexicon assigns to a word and that word's polarity in a specific context, since the polarity of a word depends on the context in which it is used. The type of a sentence can also affect the performance of a sentiment analysis approach. To tackle these challenges, QMOS integrates multiple strategies to adjust a word's prior sentiment orientation while also considering sentence type, and it employs a Semantic Sentiment Approach to determine the sentiment score of a word that is not included in any sentiment lexicon. Furthermore, most existing methods fail to distinguish the meaning of a review sentence from the user's query when the two share a similar bag of words, so there is often a conflict between the extracted opinionated sentences and the user's needs. The summarization phase of QMOS avoids extracting a review sentence whose similarity to the user's query is high but whose meaning differs. The method also employs a greedy algorithm to reduce redundancy and query expansion to bridge lexical gaps between similar contexts expressed in different wording. Our experiments show that QMOS significantly improves performance, making it comparable to other existing methods.
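    The greedy redundancy-reduction step mentioned near the end is, in generic form, a maximal-marginal-relevance selection; the sketch below assumes that reading, with the similarity inputs and the weight lam as illustrative placeholders (the sentiment scoring and query-expansion components are omitted):

      def greedy_select(sentences, query_sim, pairwise_sim, budget=5, lam=0.7):
          """MMR-style greedy selection: balance similarity to the query
          against redundancy with sentences already selected.
          query_sim[i]: similarity of sentence i to the user's query;
          pairwise_sim[i][j]: similarity between sentences i and j."""
          selected, candidates = [], list(range(len(sentences)))
          while candidates and len(selected) < budget:
              def mmr(i):
                  redundancy = max((pairwise_sim[i][j] for j in selected), default=0.0)
                  return lam * query_sim[i] - (1 - lam) * redundancy
              best = max(candidates, key=mmr)
              selected.append(best)
              candidates.remove(best)
          return [sentences[i] for i in selected]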
  7. Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.00
    6.927037E-4 = product of:
      0.0069270367 = sum of:
        0.0069270367 = product of:
          0.02078111 = sum of:
            0.02078111 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
              0.02078111 = score(doc=2640,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.19345059 = fieldWeight in 2640, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2640)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Date
    22. 1.2016 12:29:41