Search (31 results, page 2 of 2)

  • author_ss:"Croft, W.B."
  1. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.00
    0.0042441967 = product of:
      0.021220984 = sum of:
        0.021220984 = weight(_text_:information in 4277) [ClassicSimilarity], result of:
          0.021220984 = score(doc=4277,freq=14.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.256578 = fieldWeight in 4277, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4277)
      0.2 = coord(1/5)
    
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century, when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
    Source
    Annual review of information science and technology. 39(2005), pp. 3-32
  2. Croft, W.B.: Hypertext and information retrieval : what are the fundamental concepts? (1990) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 8003) [ClassicSimilarity], result of:
          0.019249868 = score(doc=8003,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274569 = fieldWeight in 8003, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=8003)
      0.2 = coord(1/5)
    
  3. Croft, W.B.: What do people want from information retrieval? : the top 10 research issues for companies that use and sell IR systems (1995) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 3402) [ClassicSimilarity], result of:
          0.019249868 = score(doc=3402,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274569 = fieldWeight in 3402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=3402)
      0.2 = coord(1/5)
    
  4. Liu, X.; Croft, W.B.: Cluster-based retrieval using language models (2004) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 4115) [ClassicSimilarity], result of:
          0.019249868 = score(doc=4115,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274569 = fieldWeight in 4115, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=4115)
      0.2 = coord(1/5)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, et al.
  5. Allan, J.; Croft, W.B.; Callan, J.: The University of Massachusetts and a dozen TRECs (2005) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 5086) [ClassicSimilarity], result of:
          0.019249868 = score(doc=5086,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274569 = fieldWeight in 5086, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=5086)
      0.2 = coord(1/5)
    
    Source
    TREC: experiment and evaluation in information retrieval. Ed.: E.M. Voorhees and D.K. Harman
  6. Murdock, V.; Kelly, D.; Croft, W.B.; Belkin, N.J.; Yuan, X.: Identifying and improving retrieval for procedural questions (2007) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 902) [ClassicSimilarity], result of:
          0.019249868 = score(doc=902,freq=8.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274569 = fieldWeight in 902, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=902)
      0.2 = coord(1/5)
    
    Abstract
    People use questions to elicit information from other people in their everyday lives, and yet the most common method of obtaining information from a search engine is by posing keywords. Research suggests that users are better at expressing their information needs in natural language; however, the vast majority of work to improve document retrieval has focused on queries posed as sets of keywords or Boolean queries. This paper focuses on improving document retrieval for the subset of natural language questions asking about how something is done. We classify questions as asking either for a description of a process or for a statement of fact, with better than 90% accuracy. Further, we identify non-content features of documents relevant to questions asking about a process. Finally, we demonstrate that we can use these features to significantly improve the precision of document retrieval results for questions asking about a process. Our approach, based on exploiting the structure of documents, shows a significant improvement in precision at rank one for questions asking about how something is done.
    Source
    Information processing and management. 43(2007) no.1, pp. 181-203
  7. Croft, W.B.: What do people want from information retrieval? (1997) 0.00
    0.00362979 = product of:
      0.01814895 = sum of:
        0.01814895 = weight(_text_:information in 589) [ClassicSimilarity], result of:
          0.01814895 = score(doc=589,freq=4.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.21943474 = fieldWeight in 589, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=589)
      0.2 = coord(1/5)
    
    Imprint
    The Hague : International Federation for Information and Documentation (FID)
  8. Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.00
    0.003334175 = product of:
      0.016670875 = sum of:
        0.016670875 = weight(_text_:information in 38) [ClassicSimilarity], result of:
          0.016670875 = score(doc=38,freq=6.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.20156369 = fieldWeight in 38, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=38)
      0.2 = coord(1/5)
    
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
  9. Croft, W.B.: Automatic indexing : file organization and display for information retrieval (1989) 0.00
    0.0032083113 = product of:
      0.016041556 = sum of:
        0.016041556 = weight(_text_:information in 2412) [ClassicSimilarity], result of:
          0.016041556 = score(doc=2412,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.19395474 = fieldWeight in 2412, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=2412)
      0.2 = coord(1/5)
    
  10. Croft, W.B.; Harper, D.J.: Using probabilistic models of document retrieval without relevance information (1979) 0.00
    0.0025666493 = product of:
      0.012833246 = sum of:
        0.012833246 = weight(_text_:information in 4520) [ClassicSimilarity], result of:
          0.012833246 = score(doc=4520,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.1551638 = fieldWeight in 4520, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4520)
      0.2 = coord(1/5)
    
  11. Kim, Y.; Seo, J.; Croft, W.B.; Smith, D.A.: Automatic suggestion of phrasal-concept queries for literature search (2014) 0.00
    0.0016041556 = product of:
      0.008020778 = sum of:
        0.008020778 = weight(_text_:information in 2692) [ClassicSimilarity], result of:
          0.008020778 = score(doc=2692,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.09697737 = fieldWeight in 2692, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2692)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 50(2014) no.4, pp. 568-583
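
The score breakdown printed under each result can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers in the listing. The function and parameter names are illustrative and not part of any Lucene or Solr API; queryNorm, fieldNorm, and coord are copied from the listing rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed older ClassicSimilarity form)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float, coord: float) -> float:
    """Recombine the factors in the same order as the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                        # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm     # fieldWeight
    return query_weight * field_weight * coord             # final score


# Values copied from result 1 (doc=4277) in the listing above.
score_4277 = classic_score(
    freq=14.0,
    doc_freq=20772,
    max_docs=44218,
    field_norm=0.0390625,
    query_norm=0.047114085,
    coord=0.2,  # coord(1/5): one of five query clauses matched
)
print(f"{score_4277:.10f}")
```

The printed value should agree with the listed 0.0042441967 up to floating-point rounding, since Lucene computes these factors in single precision. Because every hit shares the same idf, queryNorm, and coord, the ranking on this page is driven entirely by tf and fieldNorm.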