Gupta, P.; Banchs, R.E.; Rosso, P.: Continuous space models for CLIR (2017)
0.00
0.0015058789 = product of:
  0.008282334 = sum of:
    0.0061991126 = weight(_text_:a in 3295) [ClassicSimilarity], result of:
      0.0061991126 = score(doc=3295,freq=14.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.20223314 = fieldWeight in 3295, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=3295)
    0.0020832212 = weight(_text_:s in 3295) [ClassicSimilarity], result of:
      0.0020832212 = score(doc=3295,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.072074346 = fieldWeight in 3295, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.046875 = fieldNorm(doc=3295)
  0.18181819 = coord(2/11)
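The score breakdown above can be checked by hand. The following is a minimal sketch, not the search engine's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values listed. The function names are hypothetical, and queryNorm, fieldNorm, and coord are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it matches the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, following the structure of the listing.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the listing for doc 3295.
QUERY_NORM = 0.026584605
FIELD_NORM = 0.046875
MAX_DOCS = 44218

score_a = classic_term_score(14.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_s = classic_term_score(2.0, 40523, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/11): two of the eleven query terms matched this document.
coord = 2 / 11
total = (score_a + score_s) * coord

print(f"term 'a': {score_a:.7f}")
print(f"term 's': {score_s:.7f}")
print(f"total:    {total:.7f}")
```

The results should agree with the listed values up to single-precision rounding, since the listing appears to use 32-bit floats while Python uses 64-bit floats.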
- Abstract
- We present and evaluate a novel technique for learning cross-lingual continuous space models to aid cross-language information retrieval (CLIR). Our model, referred to as the external-data composition neural network (XCNN), is based on a composition function implemented on top of a deep neural network that provides a distributed learning framework. Unlike most existing models, which rely only on available parallel data for training, our learning framework provides a natural way to exploit monolingual data and its associated relevance metadata for learning continuous space representations of language. Cross-language extensions of the obtained models can then be trained using a small set of parallel data. This property is very helpful for resource-poor languages; therefore, we carry out experiments on the English-Hindi language pair. In the comparative evaluation, the proposed model is shown to outperform state-of-the-art continuous space models by a statistically significant margin on two different tasks: parallel sentence retrieval and ad-hoc retrieval.
- Source
- Information processing and management. 53(2017) no.2, pp.359-370
- Type
- a