Search (3 results, page 1 of 1)

  • × theme_ss:"Retrievalalgorithmen"
  • × type_ss:"s"
  1. Computational information retrieval (2001) 0.01
    0.011375135 = product of:
      0.034125403 = sum of:
        0.034125403 = product of:
          0.068250805 = sum of:
            0.068250805 = weight(_text_:indexing in 4167) [ClassicSimilarity], result of:
              0.068250805 = score(doc=4167,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3588626 = fieldWeight in 4167, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4167)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This volume contains selected papers that focus on the use of linear algebra, computational statistics, and computer science in the development of algorithms and software systems for text retrieval. Experts in information modeling and retrieval share their perspectives on the design of scalable but precise text retrieval systems, revealing many of the challenges and obstacles that mathematical and statistical models must overcome to be viable for automated text processing. This very useful proceedings is an excellent companion for courses in information retrieval, applied linear algebra, and applied statistics. Computational Information Retrieval provides background material on vector space models for text retrieval that applied mathematicians, statisticians, and computer scientists may not be familiar with. For graduate students in these areas, several research questions in information modeling are exposed. In addition, several case studies concerning the efficacy of the popular Latent Semantic Analysis (or Indexing) approach are provided.
    Object
    Latent Semantic Indexing
  2. Information retrieval : data structures and algorithms (1992) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 3495) [ClassicSimilarity], result of:
              0.04021717 = score(doc=3495,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 3495, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3495)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Content
    An edited volume containing data structures and algorithms for information retrieval including a disk with examples written in C. for prgrammers and students interested in parsing text, automated indexing, its the first collection in book form of the basic data structures and algorithms that are critical to the storage and retrieval of documents. ------------------Enthält die Kapitel: FRAKES, W.B.: Introduction to information storage and retrieval systems; BAEZA-YATES, R.S.: Introduction to data structures and algorithms related to information retrieval; HARMAN, D. u.a.: Inverted files; FALOUTSOS, C.: Signature files; GONNET, G.H. u.a.: New indices for text: PAT trees and PAT arrays; FORD, D.A. u. S. CHRISTODOULAKIS: File organizations for optical disks; FOX, C.: Lexical analysis and stoplists; FRAKES, W.B.: Stemming algorithms; SRINIVASAN, P.: Thesaurus construction; BAEZA-YATES, R.A.: String searching algorithms; HARMAN, D.: Relevance feedback and other query modification techniques; WARTIK, S.: Boolean operators; WARTIK, S. u.a.: Hashing algorithms; HARMAN, D.: Ranking algorithms; FOX, E.: u.a.: Extended Boolean models; RASMUSSEN, E.: Clustering algorithms; HOLLAAR, L.: Special-purpose hardware for information retrieval; STANFILL, C.: Parallel information retrieval algorithms
  3. Cross-language information retrieval (1998) 0.00
    0.0047396393 = product of:
      0.014218917 = sum of:
        0.014218917 = product of:
          0.028437834 = sum of:
            0.028437834 = weight(_text_:indexing in 6299) [ClassicSimilarity], result of:
              0.028437834 = score(doc=6299,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.14952609 = fieldWeight in 6299, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Content
    Enthält die Beiträge: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. u. W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. u.a.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. u. C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. u.a.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. u.a.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. u.a. Building a Large Multilingual Test Collection from Comparable News Documents; OARD; D.W. u. B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Christian Fluhr at al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.