Search (3 results, page 1 of 1)

  • author_ss:"Yang, Y."
  1. Mao, J.; Xu, W.; Yang, Y.; Wang, J.; Yuille, A.L.: Explain images with multimodal recurrent neural networks (2014) 0.03
    0.029296497 = product of:
      0.058592994 = sum of:
        0.058592994 = product of:
          0.11718599 = sum of:
            0.11718599 = weight(_text_:network in 1557) [ClassicSimilarity], result of:
              0.11718599 = score(doc=1557,freq=6.0), product of:
                0.22917621 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.05146125 = queryNorm
                0.51133573 = fieldWeight in 1557, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1557)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
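
    The explain tree above is Lucene's ClassicSimilarity (classic TF-IDF) breakdown of the 0.03 score. As a minimal sketch, the same arithmetic can be reproduced in Python with the constants copied from the tree:

      import math

      # Constants copied from the explain tree for doc 1557.
      freq, doc_freq, max_docs = 6.0, 1398, 44218
      query_norm, field_norm = 0.05146125, 0.046875

      tf = math.sqrt(freq)                             # 2.4494898 = tf(freq=6.0)
      idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.4533744 = idf(docFreq=1398, maxDocs=44218)
      query_weight = idf * query_norm                  # 0.22917621 = queryWeight
      field_weight = tf * idf * field_norm             # 0.51133573 = fieldWeight
      weight = query_weight * field_weight             # 0.11718599 = weight(_text_:network in 1557)
      print(weight * 0.5 * 0.5)                        # ~0.029296497 after the two coord(1/2) factors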
    
    Abstract
    In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images. It directly models the probability distribution of generating a word given previous words and the image. Image descriptions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on three benchmark datasets (IAPR TC-12 [8], Flickr 8K [28], and Flickr 30K [13]). Our model outperforms the state-of-the-art generative method. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval.
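
    The model described in this abstract combines a recurrent network over previous words with a convolutional image feature in a multimodal layer. A minimal sketch of such a decoder is given below; the layer sizes, the plain RNN cell, the additive fusion, and the precomputed CNN feature are illustrative assumptions, not the paper's exact configuration:

      import torch
      import torch.nn as nn

      class MultimodalRNN(nn.Module):
          """Sketch of an m-RNN-style decoder: an RNN over previous words whose
          states are fused with a CNN image feature in a multimodal layer,
          followed by a projection to vocabulary logits for the next word."""

          def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=256,
                       image_dim=4096, multimodal_dim=512):
              super().__init__()
              self.embed = nn.Embedding(vocab_size, embed_dim)
              self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
              # Multimodal layer: project word embedding, recurrent state and
              # image feature into a common space and combine them additively.
              self.word_proj = nn.Linear(embed_dim, multimodal_dim)
              self.hidden_proj = nn.Linear(hidden_dim, multimodal_dim)
              self.image_proj = nn.Linear(image_dim, multimodal_dim)
              self.out = nn.Linear(multimodal_dim, vocab_size)

          def forward(self, words, image_feat):
              # words: (batch, seq_len) word indices; image_feat: (batch, image_dim)
              emb = self.embed(words)                          # (batch, seq, embed)
              hidden, _ = self.rnn(emb)                        # (batch, seq, hidden)
              img = self.image_proj(image_feat).unsqueeze(1)   # (batch, 1, multimodal)
              fused = torch.tanh(self.word_proj(emb)
                                 + self.hidden_proj(hidden)
                                 + img)                        # multimodal layer
              return self.out(fused)                           # logits over the next word

      # Descriptions are generated by sampling a word from this distribution at
      # each step and feeding it back in, as the abstract describes.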
  2. Yang, Y.; Liu, X.: A re-examination of text categorization methods (1999) 0.02
    0.019733394 = product of:
      0.039466787 = sum of:
        0.039466787 = product of:
          0.078933574 = sum of:
            0.078933574 = weight(_text_:network in 3386) [ClassicSimilarity], result of:
              0.078933574 = score(doc=3386,freq=2.0), product of:
                0.22917621 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.05146125 = queryNorm
                0.3444231 = fieldWeight in 3386, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3386)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper reports a controlled study with statistical significance tests on five text categorization methods: the Support Vector Machines (SVM), a k-Nearest Neighbor (kNN) classifier, a neural network (NNet) approach, the Linear Least-Squares Fit (LLSF) mapping, and a Naive Bayes (NB) classifier. We focus on the robustness of these methods in dealing with a skewed category distribution, and their performance as a function of the training-set category frequency. Our results show that SVM, kNN and LLSF significantly outperform NNet and NB when the number of positive training instances per category is small (fewer than ten), and that all the methods perform comparably when the categories are sufficiently common (over 300 instances).
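
    A rough sketch of the kind of comparison the abstract reports is shown below, using off-the-shelf scikit-learn classifiers on a public corpus; the dataset, features, and hyperparameters are stand-in assumptions, not the paper's experimental setup:

      from sklearn.datasets import fetch_20newsgroups
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.svm import LinearSVC
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.neural_network import MLPClassifier
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.model_selection import cross_val_score

      # Public stand-in corpus; the paper evaluated on its own collections.
      data = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
      X = TfidfVectorizer(max_features=20000).fit_transform(data.data)
      y = data.target

      classifiers = {
          "SVM":  LinearSVC(),
          "kNN":  KNeighborsClassifier(n_neighbors=30),
          "NNet": MLPClassifier(hidden_layer_sizes=(100,)),
          "NB":   MultinomialNB(),
      }
      for name, clf in classifiers.items():
          # Macro-averaged F1 weights rare categories as heavily as common ones,
          # which is where the methods differ most according to the abstract.
          f1 = cross_val_score(clf, X, y, cv=3, scoring="f1_macro").mean()
          print(f"{name}: macro-F1 = {f1:.3f}")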
  3. Yang, Y.; Wilbur, J.: Using corpus statistics to remove redundant words in text categorization (1996) 0.02
    0.016914338 = product of:
      0.033828676 = sum of:
        0.033828676 = product of:
          0.06765735 = sum of:
            0.06765735 = weight(_text_:network in 4199) [ClassicSimilarity], result of:
              0.06765735 = score(doc=4199,freq=2.0), product of:
                0.22917621 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.05146125 = queryNorm
                0.29521978 = fieldWeight in 4199, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4199)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article studies aggressive word removal in text categorization to reduce the noise in free texts and to enhance the computational efficiency of categorization. We use a novel stop word identification method to automatically generate domain-specific stoplists which are much larger than a conventional domain-independent stoplist. In our tests with three categorization methods on text collections from different domains/applications, significant numbers of words were removed without sacrificing categorization effectiveness. In the test of the Expert Network method on CACM documents, for example, an 87% removal of unique words reduced the vocabulary of documents from 8,002 distinct words to 1,045 words, which resulted in a 63% time savings and a 74% memory savings in the computation of category ranking, with a 10% precision improvement on average over not using word removal. It is evident in this study that automated word removal based on corpus statistics has a practical and significant impact on the computational tractability of categorization methods in large databases.
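
    A brief sketch of corpus-statistics-based word removal in this spirit follows; the ranking statistic (tf-idf mass) and the 13% keep fraction (matching the 87% removal quoted above) are illustrative choices, not necessarily the measure the paper uses:

      import math
      from collections import Counter

      def domain_stoplist(docs, keep_fraction=0.13):
          """Rank the vocabulary by a simple corpus statistic and treat the
          low-scoring tail as a domain-specific stoplist."""
          tf, df = Counter(), Counter()
          for doc in docs:
              words = doc.lower().split()
              tf.update(words)
              df.update(set(words))
          n_docs = len(docs)
          # Words carrying little tf-idf mass are candidates for removal.
          score = {w: tf[w] * math.log((n_docs + 1) / df[w]) for w in tf}
          ranked = sorted(score, key=score.get, reverse=True)
          keep = set(ranked[: int(len(ranked) * keep_fraction)])
          return set(ranked) - keep   # words to strip before categorization

      # Usage: stop = domain_stoplist(corpus_docs); then drop these words from
      # each document before building the categorization index.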