Search (1 result, page 1 of 1)

  • × author_ss:"Xu, J."
  • × theme_ss:"Verteilte bibliographische Datenbanken"
  1. Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.01
    0.013419857 = product of:
      0.04025957 = sum of:
        0.04025957 = weight(_text_:search in 38) [ClassicSimilarity], result of:
          0.04025957 = score(doc=38,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 38, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=38)
      0.33333334 = coord(1/3)
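
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The values of queryNorm, fieldNorm, and coord are copied from the listing rather than derived, and the function names are illustrative only.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor as defined by ClassicSimilarity."""
    return math.sqrt(freq)


# Values taken directly from the explanation for doc 38.
idf = classic_idf(doc_freq=3718, max_docs=44218)
tf = classic_tf(2.0)
query_norm = 0.05027291   # copied from the listing, not derived here
field_norm = 0.046875     # copied from the listing, not derived here
coord = 1 / 3             # one of three query clauses matched

query_weight = idf * query_norm          # listing: 0.1747324
field_weight = tf * idf * field_norm     # listing: 0.230407
term_score = query_weight * field_weight # listing: 0.04025957
final_score = term_score * coord         # listing: 0.013419857

print(f"idf={idf:.6f} tf={tf:.7f}")
print(f"queryWeight={query_weight:.7f} fieldWeight={field_weight:.6f}")
print(f"term score={term_score:.8f} final score={final_score:.9f}")
```

Small differences in the last digits are expected, since Lucene computes these values in single precision.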
    
    Abstract
    Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections, and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve the effectiveness of distributed retrieval.
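
The two-step process described in the abstract (select topics for a query, then dispatch the search to collections holding those topics) can be illustrated with a small sketch. This is not the authors' method: it assumes unigram topic language models scored by query log-likelihood with add-one smoothing, and the topic names, sample texts, and the `topic_to_collections` mapping are all hypothetical.

```python
import math
from collections import Counter


def build_model(texts):
    """Build a unigram language model (term counts, total tokens) for a topic."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return counts, sum(counts.values())


def query_log_likelihood(query, model, vocab_size):
    """Log-probability of the query under a topic model, add-one smoothed."""
    counts, total = model
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in query.lower().split()
    )


def select_topics(query, topic_models, k=1):
    """Return the k topics whose language models best explain the query."""
    vocab = set()
    for counts, _ in topic_models.values():
        vocab.update(counts)
    ranked = sorted(
        topic_models,
        key=lambda t: query_log_likelihood(query, topic_models[t], len(vocab)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical topics and topic-to-collection mapping, for illustration only.
topics = {
    "retrieval": build_model(
        ["distributed retrieval of documents", "collection selection for search"]
    ),
    "cooking": build_model(["bread baking recipes", "soup and bread"]),
}
topic_to_collections = {"retrieval": ["coll_a", "coll_c"], "cooking": ["coll_b"]}

best = select_topics("distributed search", topics, k=1)
# Dispatch the search only to collections that contain the selected topics.
targets = [coll for topic in best for coll in topic_to_collections[topic]]
print(best, targets)
```

Where topic labels are unavailable, the abstract notes that clustering is used to identify topics; in this sketch, the clusters would simply replace the hand-labeled groups passed to `build_model`.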