Search (47 results, page 1 of 3)

  • theme_ss:"Automatisches Indexieren"
  1. Kanan, T.; Fox, E.A.: Automated arabic text classification with P-Stemmer, machine learning, and a tailored news article taxonomy (2016) 0.06
    0.0603554 = product of:
      0.1207108 = sum of:
        0.1207108 = product of:
          0.2414216 = sum of:
            0.2414216 = weight(_text_:news in 3151) [ClassicSimilarity], result of:
              0.2414216 = score(doc=3151,freq=14.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.76611733 = fieldWeight in 3151, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3151)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
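The nested figures above are Lucene ClassicSimilarity "explain" output. As a reading aid, the arithmetic for the first result (term "news", doc 3151) can be reproduced from the standard ClassicSimilarity formulas; everything below uses only values shown in the tree:

```python
import math

# Lucene ClassicSimilarity building blocks, as shown in the explain tree
def tf(freq):
    """Term-frequency factor: sqrt of the raw frequency."""
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

# Values taken from the first result ("news" in doc 3151)
freq, doc_freq, max_docs = 14.0, 635, 44218
query_norm, field_norm = 0.060118705, 0.0390625

idf_val = idf(doc_freq, max_docs)               # ≈ 5.2416887
query_weight = idf_val * query_norm             # ≈ 0.31512353
field_weight = tf(freq) * idf_val * field_norm  # ≈ 0.76611733
raw_score = query_weight * field_weight         # ≈ 0.2414216

# The two coord(1/2) factors halve the score twice, giving the listed 0.06
final_score = raw_score * 0.5 * 0.5
print(round(final_score, 7))
```

The same recipe reads off every other entry; only freq, fieldNorm, and the coord factors change between documents.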
    Abstract
    Arabic news articles in electronic collections are difficult to study. Browsing by category is rarely supported. Although helpful machine-learning methods have been applied successfully to similar situations for English news articles, limited research has been completed to yield suitable solutions for Arabic news. In connection with a Qatar National Research Fund (QNRF)-funded project to build digital library community and infrastructure in Qatar, we developed software for browsing a collection of about 237,000 Arabic news articles, which should be applicable to other Arabic news collections. We designed a simple taxonomy for Arabic news stories that is suitable for the needs of Qatar and other nations, is compatible with the subject codes of the International Press Telecommunications Council, and was enhanced with the aid of a librarian expert as well as five Arabic-speaking volunteers. We developed tailored stemming (i.e., a new Arabic light stemmer called P-Stemmer) and automatic classification methods (the best being binary Support Vector Machines classifiers) to work with the taxonomy. Using evaluation techniques commonly used in the information retrieval community, including 10-fold cross-validation and the Wilcoxon signed-rank test, we showed that our approach to stemming and classification is superior to state-of-the-art techniques.
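The abstract names a new Arabic light stemmer (P-Stemmer) without giving its rules. Purely as a hedged illustration of what "light stemming" means here, the sketch below strips one common Arabic prefix (conjunctions, prepositions, the definite article); the prefix list is a conventional light-stemming set, not the actual P-Stemmer rules:

```python
# Hypothetical light-stemming step: strip one common prefix if the
# remainder is still a plausible stem (>= 3 letters). Not the P-Stemmer.
PREFIXES = ["وال", "بال", "كال", "فال", "لل", "ال", "و"]

def light_stem(token: str) -> str:
    """Return the token with one common prefix removed, if safe."""
    for p in sorted(PREFIXES, key=len, reverse=True):  # longest match first
        if token.startswith(p) and len(token) - len(p) >= 3:
            return token[len(p):]
    return token

# e.g. "الأخبار" ("the news") loses the definite article "ال"
print(light_stem("الأخبار"))
```

Stemmed tokens of this kind would then feed the binary SVM classifiers the paper evaluates with 10-fold cross-validation.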
  2. Alexander, M.: Automatic indexing of document images using Excalibur EFS (1995) 0.04
    0.03649951 = product of:
      0.07299902 = sum of:
        0.07299902 = product of:
          0.14599805 = sum of:
            0.14599805 = weight(_text_:news in 1911) [ClassicSimilarity], result of:
              0.14599805 = score(doc=1911,freq=2.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.4633042 = fieldWeight in 1911, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1911)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Library technology news. 1995, no.16, S.4-8
  3. Dow Jones unveils knowledge indexing system (1997) 0.04
    0.03649951 = product of:
      0.07299902 = sum of:
        0.07299902 = product of:
          0.14599805 = sum of:
            0.14599805 = weight(_text_:news in 751) [ClassicSimilarity], result of:
              0.14599805 = score(doc=751,freq=2.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.4633042 = fieldWeight in 751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0625 = fieldNorm(doc=751)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Dow Jones Interactive Publishing has developed a sophisticated automatic knowledge indexing system that will allow searchers of the Dow Jones News/Retrieval service to get highly targeted results from a search in the service's Publications Library. Rather than relying solely on a thesaurus of company names, the new system combines that basic approach with unique rules based on the editorial styles of individual publications in the Library. Dow Jones has also announced its acceptance of the definitions of 'selected full text' and 'full text' from Bibliodata's Fulltext Sources Online directory.
  4. Pritchard-Schoch, T.: Natural language comes of age (1993) 0.04
    0.03649951 = product of:
      0.07299902 = sum of:
        0.07299902 = product of:
          0.14599805 = sum of:
            0.14599805 = weight(_text_:news in 2570) [ClassicSimilarity], result of:
              0.14599805 = score(doc=2570,freq=2.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.4633042 = fieldWeight in 2570, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2570)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Discusses natural language and its implementation in WIN (Westlaw Is Natural), Westlaw's interface to its full-text legal documents. Natural language processing is not artificial intelligence but a hybrid of linguistics, mathematics and statistics. Provides 3 classes of retrieval models. Explains how Westlaw processes an English query. Assesses WIN. Covers WIN enhancements; the natural language features of Congressional Quarterly's Washington Alert using a document for a query; the Personal Librarian front-end search software and DowQuest from Dow Jones News/Retrieval. Considers whether natural language encourages fuzzy thinking and whether Boolean logic will still be needed.
  5. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03
    0.03258102 = product of:
      0.06516204 = sum of:
        0.06516204 = product of:
          0.13032408 = sum of:
            0.13032408 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.13032408 = score(doc=402,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  6. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.03
    0.032261316 = product of:
      0.06452263 = sum of:
        0.06452263 = product of:
          0.12904526 = sum of:
            0.12904526 = weight(_text_:news in 3105) [ClassicSimilarity], result of:
              0.12904526 = score(doc=3105,freq=4.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.40950692 = fieldWeight in 3105, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3105)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper explores the possible role of named entities in an automatic indexing process based on text in subtitles. This is done by analyzing entity types, name density and name frequencies in subtitles and metadata records from different TV programs. The name density in metadata records is much higher than the name density in subtitles, and named entities with high frequencies in the subtitles are more likely to be mentioned in the metadata records. Personal names, geographical names and names of organizations were the most prominent entity types in both the news subtitles and news metadata, while persons, works and locations are the most prominent in culture programs.
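One plausible reading of the "name density" measure compared in the study is the share of tokens that are named entities (the paper's exact definition may differ). A toy sketch with a hypothetical gazetteer shows why terse metadata records score higher than running subtitle text:

```python
# Toy gazetteer of named entities; a real pipeline would use NER, not a set.
ENTITIES = {"Oslo", "NRK", "Nansen"}

def name_density(tokens):
    """Fraction of tokens that are named-entity mentions."""
    hits = sum(1 for t in tokens if t in ENTITIES)
    return hits / len(tokens)

subtitles = "Nansen left Oslo while reporters waited outside".split()
metadata = "Nansen Oslo NRK documentary expedition".split()

print(round(name_density(subtitles), 2))  # diluted by function words
print(round(name_density(metadata), 2))   # terse records are name-dense
```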
  7. Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.03
    0.028508391 = product of:
      0.057016782 = sum of:
        0.057016782 = product of:
          0.114033565 = sum of:
            0.114033565 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.114033565 = score(doc=262,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20.10.2000 12:22:23
  8. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.03
    0.028508391 = product of:
      0.057016782 = sum of:
        0.057016782 = product of:
          0.114033565 = sum of:
            0.114033565 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.114033565 = score(doc=6265,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  9. Martins, E.F.; Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: On cold start for associative tag recommendation (2016) 0.03
    0.027083963 = product of:
      0.054167926 = sum of:
        0.054167926 = product of:
          0.16250378 = sum of:
            0.16250378 = weight(_text_:objects in 2494) [ClassicSimilarity], result of:
              0.16250378 = score(doc=2494,freq=6.0), product of:
                0.3195352 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060118705 = queryNorm
                0.508563 = fieldWeight in 2494, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2494)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Tag recommendation strategies that exploit term co-occurrence patterns with tags previously assigned to the target object have consistently produced state-of-the-art results. However, such techniques work only for objects with previously assigned tags. Here we focus on tag recommendation for objects with no tags, a variation of the well-known 'cold start' problem. We start by evaluating state-of-the-art co-occurrence based methods in cold start. Our results show that the effectiveness of these methods suffers in this situation. Moreover, we show that employing various automatic filtering strategies to generate an initial tag set that enables the use of co-occurrence patterns produces only marginal improvements. We then propose a new approach that exploits both positive and negative user feedback to iteratively select input tags along with a genetic programming strategy to learn the recommendation function. Our experimental results indicate that extending the methods to include user relevance feedback leads to gains in precision of up to 58% over the best baseline in cold start scenarios and gains of up to 43% over the best baseline in objects that contain some initial tags (i.e., no cold start). We also show that our best relevance-feedback-driven strategy performs well even in scenarios that lack user cooperation (i.e., users may refuse to provide feedback) and user reliability (i.e., users may provide the wrong feedback).
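The co-occurrence baseline the abstract evaluates can be sketched in a few lines (toy data; the proposed genetic-programming and relevance-feedback method is not reproduced here). The sketch also exhibits the cold-start failure the paper targets: with no seed tags, there is nothing to co-occur with, so the baseline recommends nothing.

```python
from collections import Counter

# Toy training corpus: each previously tagged object is a set of tags.
tagged_objects = [
    {"python", "code", "tutorial"},
    {"python", "code", "snippet"},
    {"music", "jazz"},
]

def recommend(initial_tags, k=2):
    """Recommend the k tags that co-occur most with the object's seed tags."""
    scores = Counter()
    for tags in tagged_objects:
        if initial_tags & tags:          # object shares at least one seed tag
            for t in tags - initial_tags:
                scores[t] += 1           # count each co-occurring tag
    return [t for t, _ in scores.most_common(k)]

print(recommend({"python"}))  # seed tags present: co-occurrence works
print(recommend(set()))       # cold start: no seeds, empty recommendation
```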
  10. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.024435764 = product of:
      0.04887153 = sum of:
        0.04887153 = product of:
          0.09774306 = sum of:
            0.09774306 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.09774306 = score(doc=58,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:44
  11. Hauer, M.: Automatische Indexierung (2000) 0.02
    0.024435764 = product of:
      0.04887153 = sum of:
        0.04887153 = product of:
          0.09774306 = sum of:
            0.09774306 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
              0.09774306 = score(doc=5887,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.46428138 = fieldWeight in 5887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5887)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  12. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02
    0.024435764 = product of:
      0.04887153 = sum of:
        0.04887153 = product of:
          0.09774306 = sum of:
            0.09774306 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.09774306 = score(doc=2051,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:56
  13. Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.02
    0.024435764 = product of:
      0.04887153 = sum of:
        0.04887153 = product of:
          0.09774306 = sum of:
            0.09774306 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
              0.09774306 = score(doc=5629,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.46428138 = fieldWeight in 5629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5629)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    B.I.T.online. 22(2019) H.2, S.163-166
  14. Li, W.; Wong, K.-F.; Yuan, C.: Toward automatic Chinese temporal information extraction (2001) 0.02
    0.022812195 = product of:
      0.04562439 = sum of:
        0.04562439 = product of:
          0.09124878 = sum of:
            0.09124878 = weight(_text_:news in 6029) [ClassicSimilarity], result of:
              0.09124878 = score(doc=6029,freq=2.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.28956512 = fieldWeight in 6029, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6029)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Over the past few years, temporal information processing and temporal database management have increasingly become hot topics. Nevertheless, only a few researchers have investigated these areas in the Chinese language. This lays down the objective of our research: to exploit Chinese language processing techniques for temporal information extraction and concept reasoning. In this article, we first study the mechanism for expressing time in Chinese. On the basis of the study, we then design a general frame structure for maintaining the extracted temporal concepts and propose a system for extracting time-dependent information from Hong Kong financial news. In the system, temporal knowledge is represented by different types of temporal concepts (TTC) and different temporal relations, including absolute and relative relations, which are used to correlate between action times and reference times. In analyzing a sentence, the algorithm first determines the situation related to the verb. This in turn will identify the type of temporal concept associated with the verb. After that, the relevant temporal information is extracted and the temporal relations are derived. These relations link relevant concept frames together in chronological order, which in turn provide the knowledge to fulfill users' queries, e.g., for question-answering (i.e., Q&A) applications
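The first step such a system performs, recognizing absolute time expressions in Chinese text, can be sketched with a simple pattern for 年/月/日 (year/month/day) dates. This is a hedged illustration only; the paper's temporal-concept frames and relative-relation reasoning go well beyond pattern matching:

```python
import re
from datetime import date

# Absolute Chinese date expressions of the form YYYY年M月D日
PATTERN = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")

def extract_dates(text):
    """Map each matched expression to a calendar date."""
    return [date(int(y), int(m), int(d)) for y, m, d in PATTERN.findall(text)]

sentence = "恒生指数在1998年8月13日下跌"  # "The Hang Seng Index fell on 13 Aug 1998"
print(extract_dates(sentence))
```

Relative expressions ("三天前", three days ago) would additionally need a reference time, which is where the paper's temporal relations come in.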
  15. Donath, A.: Flickr sorgt mit Automatik-Tags für Aufregung (2015) 0.02
    0.022812195 = product of:
      0.04562439 = sum of:
        0.04562439 = product of:
          0.09124878 = sum of:
            0.09124878 = weight(_text_:news in 1876) [ClassicSimilarity], result of:
              0.09124878 = score(doc=1876,freq=2.0), product of:
                0.31512353 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.060118705 = queryNorm
                0.28956512 = fieldWeight in 1876, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1876)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    http://www.golem.de/news/unsensible-verschlagwortung-flickr-sorgt-mit-automatik-tags-fuer-aufregung-1505-114202.html
  16. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.02
    0.020363137 = product of:
      0.040726274 = sum of:
        0.040726274 = product of:
          0.08145255 = sum of:
            0.08145255 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
              0.08145255 = score(doc=1952,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.38690117 = fieldWeight in 1952, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    16. 8.1998 12:51:22
  17. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02
    0.020363137 = product of:
      0.040726274 = sum of:
        0.040726274 = product of:
          0.08145255 = sum of:
            0.08145255 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.08145255 = score(doc=4157,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  18. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.02
    0.020363137 = product of:
      0.040726274 = sum of:
        0.040726274 = product of:
          0.08145255 = sum of:
            0.08145255 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
              0.08145255 = score(doc=374,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.38690117 = fieldWeight in 374, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=374)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 4.2002 10:22:41
  19. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.02
    0.020363137 = product of:
      0.040726274 = sum of:
        0.040726274 = product of:
          0.08145255 = sum of:
            0.08145255 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.08145255 = score(doc=2759,freq=2.0), product of:
                0.21052547 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060118705 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  20. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.: ¬A picture is worth a thousand (coherent) words : building a natural description of images (2014) 0.02
    0.018958775 = product of:
      0.03791755 = sum of:
        0.03791755 = product of:
          0.11375265 = sum of:
            0.11375265 = weight(_text_:objects in 1874) [ClassicSimilarity], result of:
              0.11375265 = score(doc=1874,freq=6.0), product of:
                0.3195352 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060118705 = queryNorm
                0.3559941 = fieldWeight in 1874, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1874)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    "People can summarize a complex scene in a few words without thinking twice. It's much more difficult for computers. But we've just gotten a bit closer -- we've developed a machine-learning system that can automatically produce captions (like the three above) to accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images. Recent research has greatly improved object detection, classification, and labeling. But accurately describing a complex scene requires a deeper representation of what's going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing to form a complete image description approach. But what if we instead merged recent computer vision and language models into a single jointly trained system, taking an image and directly producing a human readable sequence of words to describe it? This idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German. Now, what if we replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images? Normally, the CNN's last layer is used in a final Softmax among known classes of objects, assigning a probability that each object might be in the image. 
But if we remove that final layer, we can instead feed the CNN's rich encoding of the image into an RNN designed to produce phrases. We can then train the whole system directly on images and their captions, so it maximizes the likelihood that descriptions it produces best match the training descriptions for each image."
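The encoder-decoder wiring described above can be sketched minimally: a stand-in for the CNN's feature vector initializes an RNN's hidden state, and the RNN then emits words greedily. The weights are random and untrained, so the output is not a meaningful caption; the sketch shows only the architecture the passage describes.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<start>", "a", "dog", "on", "grass", "<end>"]
V, H, F = len(vocab), 8, 16        # vocab, hidden, and CNN-feature sizes

W_init = rng.normal(size=(H, F))   # image features -> initial hidden state
W_xh = rng.normal(size=(H, V))     # one-hot word input -> hidden
W_hh = rng.normal(size=(H, H))     # recurrence
W_hy = rng.normal(size=(V, H))     # hidden -> word scores

def caption(image_features, max_len=5):
    # "Replace the first RNN and its input words" with the CNN encoding:
    h = np.tanh(W_init @ image_features)
    word = vocab.index("<start>")
    out = []
    for _ in range(max_len):
        x = np.eye(V)[word]                  # one-hot previous word
        h = np.tanh(W_xh @ x + W_hh @ h)     # RNN step
        word = int(np.argmax(W_hy @ h))      # greedy choice of next word
        if vocab[word] == "<end>":
            break
        out.append(vocab[word])
    return out

print(caption(rng.normal(size=F)))
```

A trained system would replace the random weights by maximizing the likelihood of reference captions, exactly as the passage describes.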

Languages

  • e 30
  • d 16
  • ru 1

Types

  • a 40
  • el 6
  • x 2
  • m 1
  • s 1