Search (1 result, page 1 of 1)

  • author_ss:"Baeza-Yates, R."
  • theme_ss:"Retrievalalgorithmen"
  1. Baeza-Yates, R.; Navarro, G.: Block addressing indices for approximate text retrieval (2000) 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 4295) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4295,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4295, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4295)
      0.33333334 = coord(1/3)
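
The score tree above can be re-derived by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values in the listing. The function names are illustrative and are not Lucene API calls; queryNorm, fieldNorm and coord are copied from the listing rather than recomputed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed classic formula: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    # Term frequency is damped with a square root.
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  matched_clauses, total_clauses):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # queryWeight line
    field_weight = classic_tf(freq) * idf * field_norm   # fieldWeight line
    coord = matched_clauses / total_clauses              # coord(1/3) line
    return query_weight * field_weight * coord


# Values taken from the explanation for doc 4295.
score = explain_score(freq=4.0, doc_freq=3718, max_docs=44218,
                      query_norm=0.05027291, field_norm=0.046875,
                      matched_clauses=1, total_clauses=3)
print(round(score, 4))
```

The result rounds to the 0.02 shown next to the title; the coord factor of 1/3 indicates that only one of three query clauses matched the document.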
    
    Abstract
    The issue of reducing the space overhead when indexing large text databases is becoming more and more important as text collections grow in size. Another subject, which is gaining importance as text databases grow and become more heterogeneous and error-prone, is that of flexible string matching. One of the best tools to make the search more flexible is to allow a limited number of differences between the words found and those sought. This is called 'approximate text searching', which is becoming more and more popular. In recent years some indexing schemes with very low space overhead have appeared, some of them dealing with approximate searching. These low-overhead indices (whose best-known example is Glimpse) are modified inverted files, where space is saved by making the lists of occurrences point to text blocks instead of exact word positions. Despite their existence, little is known about the expected behaviour of these 'block addressing' indices, and even less is known when it comes to coping with approximate search. Our main contribution is an analytical study of the space-time trade-offs for indexed text searching
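
To make the block-addressing idea in the abstract concrete, here is a toy sketch, not the Glimpse implementation or the authors' method. It assumes fixed-size blocks measured in words, a word-level edit distance as the notion of "differences", and a plain scan of the vocabulary; the names `build_block_index` and `approx_search` are hypothetical.

```python
import re
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    # Standard dynamic-programming Levenshtein distance, row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def build_block_index(text: str, block_size: int):
    # Split the text into blocks of block_size words; each word's posting
    # list stores block numbers only, not exact word positions.
    words = re.findall(r"\w+", text.lower())
    blocks = [words[i:i + block_size] for i in range(0, len(words), block_size)]
    index = defaultdict(set)
    for block_id, block in enumerate(blocks):
        for word in block:
            index[word].add(block_id)
    return index, blocks


def approx_search(index, blocks, pattern: str, k: int):
    pattern = pattern.lower()
    # 1. Scan the vocabulary for words within k differences of the pattern.
    matching = {w for w in index if edit_distance(w, pattern) <= k}
    # 2. Merge their posting lists into a set of candidate blocks.
    candidates = sorted(set().union(*(index[w] for w in matching)))
    # 3. Scan only the candidate blocks to recover the exact positions.
    return [(b, offset, w)
            for b in candidates
            for offset, w in enumerate(blocks[b])
            if w in matching]


text = ("flexible search tolerates a serch typo while indexing "
        "large text collections and searching blocks")
index, blocks = build_block_index(text, block_size=4)
hits = approx_search(index, blocks, "search", k=1)
print(hits)
```

Larger blocks shorten the posting lists and so reduce index space, but more text must be scanned in step 3; this is the space-time trade-off the abstract refers to.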