Mao, J.; Xu, W.; Yang, Y.; Wang, J.; Yuille, A.L.: Explain images with multimodal recurrent neural networks (2014)
- Abstract
- In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images. It directly models the probability distribution of generating a word given the previous words and the image, and image descriptions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on three benchmark datasets (IAPR TC-12 [8], Flickr 8K [28], and Flickr 30K [13]). Our model outperforms the state-of-the-art generative method. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves a significant performance improvement over state-of-the-art methods that directly optimize the ranking objective function for retrieval.
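The abstract describes a word-level model conditioned on both the sentence history and the image: a recurrent state summarizes the previous words, a convolutional network supplies an image feature, and a multimodal layer fuses them before a softmax over the vocabulary. The following is a minimal NumPy sketch of that idea only; all layer sizes, parameter names, and the tanh fusion are illustrative assumptions, not the authors' actual implementation, and the CNN is stood in by a fixed random feature vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration (not taken from the paper).
vocab_size, embed_dim, rnn_dim, img_dim, multi_dim = 50, 16, 32, 24, 40

# Randomly initialised parameters for the sketch.
W_embed = rng.normal(0, 0.1, (vocab_size, embed_dim))  # word embeddings
W_in    = rng.normal(0, 0.1, (embed_dim, rnn_dim))     # input -> recurrent
W_rec   = rng.normal(0, 0.1, (rnn_dim, rnn_dim))       # recurrent -> recurrent
V_word  = rng.normal(0, 0.1, (embed_dim, multi_dim))   # projections into the
V_rnn   = rng.normal(0, 0.1, (rnn_dim, multi_dim))     # multimodal layer
V_img   = rng.normal(0, 0.1, (img_dim, multi_dim))
W_out   = rng.normal(0, 0.1, (multi_dim, vocab_size))  # multimodal -> vocab

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(word_id, h_prev, img_feat):
    """One step: P(next word | previous words, image)."""
    e = W_embed[word_id]                                     # embed current word
    h = np.tanh(e @ W_in + h_prev @ W_rec)                   # recurrent state
    m = np.tanh(e @ V_word + h @ V_rnn + img_feat @ V_img)   # multimodal fusion
    return h, softmax(m @ W_out)                             # vocab distribution

img_feat = rng.normal(0, 1, img_dim)   # stand-in for a CNN image feature
h = np.zeros(rnn_dim)
h, probs = step(word_id=3, h_prev=h, img_feat=img_feat)
```

A caption would be generated by repeatedly sampling a word from `probs` and feeding it back as the next `word_id`, exactly the sampling procedure the abstract mentions.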