Search (2 results, page 1 of 1)

  • author_ss:"Srihari, R.K."
  1. Srihari, R.K.: Using speech input for image interpretation, annotation, and retrieval (1997) 0.02
    0.022972263 = product of:
      0.06891679 = sum of:
        0.06891679 = product of:
          0.10337518 = sum of:
            0.06189794 = weight(_text_:retrieval in 764) [ClassicSimilarity], result of:
              0.06189794 = score(doc=764,freq=8.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.40105087 = fieldWeight in 764, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=764)
            0.04147724 = weight(_text_:22 in 764) [ClassicSimilarity], result of:
              0.04147724 = score(doc=764,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.23214069 = fieldWeight in 764, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=764)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
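The breakdown above is standard Lucene/Solr ClassicSimilarity explain output. As an illustration only (a sketch, not this search system's own code), the following snippet recomputes the 0.02 score for document 764 from the constants listed in the explain tree; the helper function name and structure are our own.

```python
import math

# Minimal sketch of Lucene ClassicSimilarity scoring for document 764.
# All constants are copied from the explain output above; only the helper
# name and layout are ours.

def clause_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    """tf-idf weight of one matching query clause (queryWeight * fieldWeight)."""
    tf = math.sqrt(freq)                               # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight in the document
    return query_weight * field_weight

QUERY_NORM = 0.051022716   # queryNorm from the explain output
FIELD_NORM = 0.046875      # fieldNorm(doc=764)

retrieval = clause_weight(8.0, 5836, 44218, QUERY_NORM, FIELD_NORM)   # _text_:retrieval
term_22   = clause_weight(2.0, 3622, 44218, QUERY_NORM, FIELD_NORM)   # _text_:22

# coord(2/3): two of three optional clauses matched; the outer coord(1/3)
# appears to come from an enclosing BooleanQuery.
score = (retrieval + term_22) * (2 / 3) * (1 / 3)
print(f"{score:.9f}")   # ~0.022972263, displayed as 0.02
```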
    
    Abstract
    Explores the interaction of textual and photographic information in an integrated text and image database environment and describes 3 different applications involving the exploitation of linguistic context in vision. Describes the practical application of these ideas in working systems. PICTION uses captions to identify human faces in a photograph, while Show&Tell is a multimedia system for semi-automatic image annotation. The system combines advances in speech recognition, natural language processing, and image understanding to assist in image annotation and enhance image retrieval capabilities. Presents an extension of this work to video annotation and retrieval.
    Date
    22.09.1997 19:16:05
    Source
    Digital image access and retrieval: Proceedings of the 1996 Clinic on Library Applications of Data Processing, 24-26 Mar 1996. Ed.: P.B. Heidorn and B. Sandore
  2. Srihari, R.K.; Zhang, Z.: Exploiting multimodal context in image retrieval (1999) 0.01
    0.006877549 = product of:
      0.020632647 = sum of:
        0.020632647 = product of:
          0.06189794 = sum of:
            0.06189794 = weight(_text_:retrieval in 848) [ClassicSimilarity], result of:
              0.06189794 = score(doc=848,freq=8.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.40105087 = fieldWeight in 848, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=848)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
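The second hit is scored by the same formula: only the retrieval clause contributes for document 848, so its clause weight is simply scaled by the two coordination factors. A quick check, again reusing only numbers from the explain output above:

```python
# Quick check (same ClassicSimilarity formula as the sketch above): document 848
# matches only the "retrieval" clause, so its score is that clause's weight
# scaled by coord(1/3) twice. The constant is copied from the explain output.
retrieval_848 = 0.06189794
score_848 = retrieval_848 * (1 / 3) * (1 / 3)
print(f"{score_848:.9f}")   # ~0.006877549, displayed as 0.01
```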
    
    Abstract
    This research explores the interaction of textual and photographic information in multimodal documents. The World Wide Web (WWW) may be viewed as the ultimate large-scale, dynamically changing multimedia database. Finding useful information on the WWW without encountering numerous false positives (as is currently the case) poses a challenge to multimedia information retrieval (MMIR) systems. This work exploits the fact that images do not appear in isolation, but rather with accompanying collateral text. Taken independently, existing techniques for picture retrieval using collateral text-based methods and image-based methods have several limitations. Text-based methods, while very powerful in matching context, do not have access to image content. Image-based methods compute general similarity between images and provide only limited semantics. This research focuses on improving precision and recall in an MMIR system by interactively combining text processing with image processing (IP) in both the indexing and retrieval phases. A picture search engine is demonstrated as an application.