Search (5 results, page 1 of 1)

  • theme_ss:"Automatisches Indexieren"
  • type_ss:"el"
  1. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.: Show and tell : a neural image caption generator (2014) 0.00
    0.0012781365 = product of:
      0.010225092 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 1869) [ClassicSimilarity], result of:
              0.030675275 = score(doc=1869,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 1869, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1869)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU-1 score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69. We also show BLEU-1 score improvements on Flickr30k, from 56 to 66, and on SBU, from 19 to 28. Lastly, on the newly released COCO dataset, we achieve a BLEU-4 of 27.7, which is the current state-of-the-art.
  2. Schöneberg, U.; Gödert, W.: Erschließung mathematischer Publikationen mittels linguistischer Verfahren (2012) 0.00
    0.0010534719 = product of:
      0.008427775 = sum of:
        0.008427775 = product of:
          0.025283325 = sum of:
            0.025283325 = weight(_text_:29 in 1055) [ClassicSimilarity], result of:
              0.025283325 = score(doc=1055,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23319192 = fieldWeight in 1055, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1055)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    12. 9.2013 12:29:05
  3. Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.00
    0.0010534719 = product of:
      0.008427775 = sum of:
        0.008427775 = product of:
          0.025283325 = sum of:
            0.025283325 = weight(_text_:29 in 2144) [ClassicSimilarity], result of:
              0.025283325 = score(doc=2144,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23319192 = fieldWeight in 2144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2144)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Code4Lib journal. Issue 29(2015), [http://journal.code4lib.org/issues/issues/issue29]
  4. Junger, U.; Schwens, U.: Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.00
    8.699961E-4 = product of:
      0.0069599687 = sum of:
        0.0069599687 = product of:
          0.020879906 = sum of:
            0.020879906 = weight(_text_:22 in 3780) [ClassicSimilarity], result of:
              0.020879906 = score(doc=3780,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19345059 = fieldWeight in 3780, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3780)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    19. 8.2017 9:24:22
  5. Markoff, J.: Researchers announce advance in image-recognition software (2014) 0.00
    6.3906825E-4 = product of:
      0.005112546 = sum of:
        0.005112546 = product of:
          0.015337638 = sum of:
            0.015337638 = weight(_text_:problem in 1875) [ClassicSimilarity], result of:
              0.015337638 = score(doc=1875,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.11723843 = fieldWeight in 1875, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1875)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Content
    In the longer term, the new research may lead to technology that helps the blind and robots navigate natural environments. But it also raises chilling possibilities for surveillance. During the past 15 years, video cameras have been placed in a vast number of public and private spaces. In the future, the software operating the cameras will not only be able to identify particular humans via facial recognition, experts say, but also identify certain types of behavior, perhaps even automatically alerting authorities. Two years ago Google researchers created image-recognition software and presented it with 10 million images taken from YouTube videos. Without human guidance, the program trained itself to recognize cats - a testament to the number of cat videos on YouTube. Current artificial intelligence programs in new cars already can identify pedestrians and bicyclists from cameras positioned atop the windshield and can stop the car automatically if the driver does not take action to avoid a collision. But "just single object recognition is not very beneficial," said Ali Farhadi, a computer scientist at the University of Washington who has published research on software that generates sentences from digital pictures. "We've focused on objects, and we've ignored verbs," he said, adding that these programs do not grasp what is going on in an image. Both the Google and Stanford groups tackled the problem by refining software programs known as neural networks, inspired by our understanding of how the brain works. Neural networks can "train" themselves to discover similarities and patterns in data, even when their human creators do not know the patterns exist.
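
The score breakdown listed under each result can be checked by hand. The sketch below recomputes the first result's score from the values printed in its explain tree, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library, and queryNorm is copied from the listing because it depends on the full query, which is not shown here. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math


def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def result_score(term, inner_coord, outer_coord):
    # Apply the two coordination factors shown in the listing.
    return term * inner_coord * outer_coord


# Values copied from result 1 (doc=1869, term "problem").
term = term_score(freq=2.0, doc_freq=1723, max_docs=44218,
                  query_norm=0.030822188, field_norm=0.0390625)
score = result_score(term, inner_coord=1 / 3, outer_coord=1 / 8)
print(f"term score:   {term:.9f}")
print(f"result score: {score:.10f}")
```

Swapping in the values from result 5 (same term, fieldNorm 0.01953125) halves the score, which matches the listing: the two documents differ only in the field-length norm.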
