- Mao, J.; Xu, W.; Yang, Y.; Wang, J.; Yuille, A.L.: Explain images with multimodal recurrent neural networks (2014)
- Abstract
- In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images. It directly models the probability distribution of generating a word given previous words and the image. Image descriptions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on three benchmark datasets (IAPR TC-12 [8], Flickr 8K [28], and Flickr 30K [13]). Our model outperforms the state-of-the-art generative method. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval.
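The generation step described in the abstract, the probability of the next word given the previous words and the image, fusing the recurrent state and a CNN image feature in a multimodal layer, can be sketched as follows. This is a minimal illustration only: all dimensions, weight names, the tanh nonlinearities, and the random initialisation are assumptions, not the paper's actual configuration or trained parameters, and a random vector stands in for the CNN sub-network's image feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative assumptions, not the paper's settings)
vocab, d_embed, d_rnn, d_img, d_multi = 50, 16, 16, 32, 24

# Randomly initialised matrices standing in for trained parameters
W_embed = rng.normal(0, 0.1, (vocab, d_embed))    # word embedding table
W_in    = rng.normal(0, 0.1, (d_embed, d_rnn))    # embedding -> recurrent
W_rec   = rng.normal(0, 0.1, (d_rnn, d_rnn))      # recurrent -> recurrent
V_w     = rng.normal(0, 0.1, (d_embed, d_multi))  # word -> multimodal layer
V_r     = rng.normal(0, 0.1, (d_rnn, d_multi))    # recurrent -> multimodal layer
V_i     = rng.normal(0, 0.1, (d_img, d_multi))    # image feature -> multimodal layer
W_out   = rng.normal(0, 0.1, (d_multi, vocab))    # multimodal layer -> vocab logits

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def next_word_distribution(word_ids, image_feat):
    """P(next word | previous words, image): one step of an m-RNN-style model."""
    r = np.zeros(d_rnn)    # recurrent state of the sentence sub-network
    w = np.zeros(d_embed)  # embedding of the most recent word
    for t in word_ids:     # run the recurrent sub-network over previous words
        w = W_embed[t]
        r = np.tanh(w @ W_in + r @ W_rec)
    # Multimodal layer: the two sub-networks interact here with the image feature
    m = np.tanh(w @ V_w + r @ V_r + image_feat @ V_i)
    return softmax(m @ W_out)  # distribution over the vocabulary

image_feat = rng.normal(size=d_img)  # stands in for a deep CNN image feature
p = next_word_distribution([3, 7, 1], image_feat)
```

Sampling repeatedly from `p` (and feeding each sampled word back in) is what the abstract means by generating a description from the modelled distribution.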
- Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015)
- Donahue, J.; Hendricks, L.A.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description (2014)
- Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints