Vilares, J.; Alonso, M.A.; Doval, Y.; Vilares, M.: Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval (2016)
- Abstract
- General graph random walks have been successfully applied to multi-document summarization, but processing documents in this way has some limitations. In this paper, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first exploits the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences. The hypergraph is then used to capture both the cluster relationships based on the word-topic probability distribution and the pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs is developed to rank sentences, ensuring sentence diversity in summaries through vertex reinforcement. Experimental results on a publicly available dataset demonstrate the effectiveness of our framework.
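The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the hyperedges (topic clusters and pairwise sentence links) are already given, so the HDP step is omitted, and it swaps in a DivRank-style approximation in which a vertex's current score stands in for its visit count. The function names, the damping value, and the toy hypergraph are all hypothetical.

```python
def hypergraph_transition(n, hyperedges):
    """Row-stochastic matrix of a standard random walk on a hypergraph.

    hyperedges: list of (weight, set_of_vertex_indices).
    From vertex u, choose an incident hyperedge with probability
    proportional to its weight, then choose one of its vertices uniformly.
    """
    P = [[0.0] * n for _ in range(n)]
    degree = [0.0] * n
    for w, members in hyperedges:
        for u in members:
            degree[u] += w
    for w, members in hyperedges:
        for u in members:
            for v in members:
                P[u][v] += w / degree[u] / len(members)
    return P


def vertex_reinforced_rank(P, damping=0.85, iterations=100):
    """Time-variant walk: transitions are reweighted each step by the
    current score of the target vertex (assumed approximation of
    reinforcement by visit counts)."""
    n = len(P)
    prior = [1.0 / n] * n
    p = prior[:]
    for _ in range(iterations):
        nxt = [(1.0 - damping) * prior[v] for v in range(n)]
        for u in range(n):
            # Favour targets that already hold a high score.
            weights = [P[u][v] * p[v] for v in range(n)]
            total = sum(weights)
            if total == 0.0:
                continue  # vertex u belongs to no hyperedge
            for v in range(n):
                nxt[v] += damping * p[u] * weights[v] / total
        s = sum(nxt)
        p = [x / s for x in nxt]
    return p


# Toy input (hypothetical): five sentences, two topic-cluster hyperedges,
# and one pairwise-similarity hyperedge linking sentences 2 and 3.
edges = [(1.0, {0, 1, 2}), (1.0, {3, 4}), (0.5, {2, 3})]
scores = vertex_reinforced_rank(hypergraph_transition(5, edges))
ranking = sorted(range(5), key=lambda i: scores[i], reverse=True)
```

Because reinforcement concentrates score on a few vertices and drains it from their neighbours, the top of `ranking` tends to draw from different clusters, which is the diversity effect the abstract refers to.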