Search (7 results, page 1 of 1)

Croft, W.B.; Turtle, H.R.: Retrieval strategies for hypertext (1993) 0.00

0.0028092582 = product of:
  0.022474065 = sum of:
    0.022474065 = product of:
      0.067422196 = sum of:
        0.067422196 = weight(_text_:29 in 4711) [ClassicSimilarity], result of:
          0.067422196 = score(doc=4711,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.6218451 = fieldWeight in 4711, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.125 = fieldNorm(doc=4711)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Source: Information processing and management. 29(1993) no.3, pp.313-324
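
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and are not part of Lucene; the input values are copied from the explanation for doc 4711.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by classic TF-IDF scoring (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Rebuild the score tree: queryWeight * fieldWeight, scaled by each coord factor.

    `coords` is a list of (matched, total) pairs, one per coord(...) line.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight
    score = query_weight * field_weight
    for matched, total in coords:
        score *= matched / total
    return score


# Values taken from the explanation for doc 4711 (term "29").
score = explain_score(
    freq=2.0,
    doc_freq=3565,
    max_docs=44218,
    query_norm=0.030822188,
    field_norm=0.125,
    coords=[(1, 3), (1, 8)],
)
print(score)
```

The result should agree with the 0.0028092582 shown in the listing up to floating-point rounding, since the listing uses single-precision values.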

Belkin, N.J.; Croft, W.B.: Retrieval techniques (1987) 0.00

0.0027839874 = product of:
  0.0222719 = sum of:
    0.0222719 = product of:
      0.0668157 = sum of:
        0.0668157 = weight(_text_:22 in 334) [ClassicSimilarity], result of:
          0.0668157 = score(doc=334,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.61904186 = fieldWeight in 334, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=334)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Source: Annual review of information science and technology. 22(1987), pp.109-145

Belkin, N.J.; Croft, W.B.: Information filtering and information retrieval : two sides of the same coin? (1992) 0.00

0.0021069439 = product of:
  0.01685555 = sum of:
    0.01685555 = product of:
      0.05056665 = sum of:
        0.05056665 = weight(_text_:29 in 6093) [ClassicSimilarity], result of:
          0.05056665 = score(doc=6093,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.46638384 = fieldWeight in 6093, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=6093)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Source: Communications of the Association for Computing Machinery. 35(1992) no.12, pp.29-38

Allan, J.; Croft, W.B.; Callan, J.: The University of Massachusetts and a dozen TRECs (2005) 0.00

0.0021069439 = product of:
  0.01685555 = sum of:
    0.01685555 = product of:
      0.05056665 = sum of:
        0.05056665 = weight(_text_:29 in 5086) [ClassicSimilarity], result of:
          0.05056665 = score(doc=5086,freq=2.0), product of:
            0.108422816 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.030822188 = queryNorm
            0.46638384 = fieldWeight in 5086, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=5086)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 29. 3.1996 18:16:49

Allan, J.; Callan, J.P.; Croft, W.B.; Ballesteros, L.; Broglio, J.; Xu, J.; Shu, H.: INQUERY at TREC-5 (1997) 0.00

0.0017399922 = product of:
  0.013919937 = sum of:
    0.013919937 = product of:
      0.04175981 = sum of:
        0.04175981 = weight(_text_:22 in 3103) [ClassicSimilarity], result of:
          0.04175981 = score(doc=3103,freq=2.0), product of:
            0.10793405 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.030822188 = queryNorm
            0.38690117 = fieldWeight in 3103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3103)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)

Date: 27. 2.1999 20:55:22

Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.00
```
0.0015337638 = product of:
  0.012270111 = sum of:
    0.012270111 = product of:
      0.03681033 = sum of:
        0.03681033 = weight(_text_:problem in 38) [ClassicSimilarity], result of:
          0.03681033 = score(doc=38,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.28137225 = fieldWeight in 38, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=38)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve effectiveness of distributed retrieval.

Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.00
```
0.0012781365 = product of:
  0.010225092 = sum of:
    0.010225092 = product of:
      0.030675275 = sum of:
        0.030675275 = weight(_text_:problem in 4277) [ClassicSimilarity], result of:
          0.030675275 = score(doc=4277,freq=2.0), product of:
            0.13082431 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.030822188 = queryNorm
            0.23447686 = fieldWeight in 4277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4277)
      0.33333334 = coord(1/3)
  0.125 = coord(1/8)
```
Abstract

This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
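
The idea of scoring a document by the likelihood that its language model generated the query can be illustrated with a small sketch. This is an illustration under stated assumptions, not the chapter's own method: it uses a unigram model with Jelinek-Mercer (linear interpolation) smoothing, which the abstract does not name, and the function name, the `lam` parameter, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.5):
    """Log-probability of `query` under a smoothed unigram model of `doc`.

    Assumption: Jelinek-Mercer smoothing, i.e. the document model is mixed
    with the collection model using weight `lam` on the collection.
    All arguments are lists of tokens.
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    log_p = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc) if doc else 0.0
        p_coll = coll_tf[term] / len(collection) if collection else 0.0
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy data, invented for illustration only.
doc_a = "language model retrieval".split()
doc_b = "speech recognition system".split()
collection = doc_a + doc_b
query = ["language", "retrieval"]

score_a = query_log_likelihood(query, doc_a, collection)
score_b = query_log_likelihood(query, doc_b, collection)
print(score_a > score_b)
```

Documents are then ranked by this log-likelihood; smoothing keeps a document from receiving zero probability merely because it lacks one query term.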
