Search (30 results, page 2 of 2)

Croft, W.B.: Clustering large files of documents using the single link method (1977) 0.00

0.0019191063 = product of:
  0.02111017 = sum of:
    0.02111017 = weight(_text_:of in 5489) [ClassicSimilarity], result of:
      0.02111017 = score(doc=5489,freq=4.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.39093933 = fieldWeight in 5489, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.125 = fieldNorm(doc=5489)
  0.09090909 = coord(1/11)
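
The explain tree above can be re-derived by hand. The following is a minimal sketch, not the search engine's own code: the function names are hypothetical, and the formulas are assumptions inferred from the tree itself. `tf` is taken as the square root of the term frequency (2.0 for freq=4.0). `idf` is assumed to be `1 + ln(maxDocs / (docFreq + 1))`, which reproduces the 1.5637573 shown. The values `queryNorm`, `fieldNorm` and `coord` are copied from the output rather than computed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float, coord: float) -> float:
    """Recombine the factors in the same order as the explain tree."""
    tf = math.sqrt(freq)                     # tf(freq=4.0) = 2.0
    idf = classic_idf(doc_freq, max_docs)    # about 1.5637573
    query_weight = idf * query_norm          # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm     # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord


# Values copied from the explain output for doc 5489.
score = classic_score(freq=4.0, doc_freq=25162, max_docs=44218,
                      field_norm=0.125, query_norm=0.034531306,
                      coord=1 / 11)
print(f"{score:.7f}")
```

Because `idf` and `queryNorm` are identical for every record on this page, the ranking here is driven only by `tf` and `fieldNorm`. That is why a record with freq=28.0 but a small `fieldNorm` can rank below one with freq=4.0.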

Source: Journal of the American Society for Information Science. 28(1977), S.341-344

Croft, W.B.: Automatic indexing : file organization and display for information retrieval (1989) 0.00

0.0018964836 = product of:
  0.020861318 = sum of:
    0.020861318 = weight(_text_:of in 2412) [ClassicSimilarity], result of:
      0.020861318 = score(doc=2412,freq=10.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.38633084 = fieldWeight in 2412, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=2412)
  0.09090909 = coord(1/11)

Source: Indexing: the state of our knowledge and the state of our ignorance. Proceedings of the 20th Annual Meeting of the American Society of Indexers, New York City, May 13, 1988. Ed.: B.H. Weinberg

Croft, W.B.; Thompson, R.H.: I3R: a new approach to the design of document retrieval systems (1987) 0.00

0.0016792181 = product of:
  0.0184714 = sum of:
    0.0184714 = weight(_text_:of in 3898) [ClassicSimilarity], result of:
      0.0184714 = score(doc=3898,freq=4.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.34207192 = fieldWeight in 3898, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.109375 = fieldNorm(doc=3898)
  0.09090909 = coord(1/11)

Source: Journal of the American Society for Information Science. 38(1987), S.389-404

Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.00
```
0.001586712 = product of:
  0.01745383 = sum of:
    0.01745383 = weight(_text_:of in 4277) [ClassicSimilarity], result of:
      0.01745383 = score(doc=4277,freq=28.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.32322758 = fieldWeight in 4277, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4277)
  0.09090909 = coord(1/11)
```
Abstract

This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.

Source

Annual review of information science and technology. 39(2005), S.3-32

Croft, W.B.; Thompson, R.H.: Support for browsing in an intelligent text retrieval system (1989) 0.00

0.0011873866 = product of:
  0.0130612515 = sum of:
    0.0130612515 = weight(_text_:of in 5004) [ClassicSimilarity], result of:
      0.0130612515 = score(doc=5004,freq=2.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.24188137 = fieldWeight in 5004, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.109375 = fieldNorm(doc=5004)
  0.09090909 = coord(1/11)

Source: International journal of man-machine studies. 30(1989), S.639-668

Allan, J.; Ballesteros, L.; Callan, J.P.; Croft, W.B.; Lu, Z.: Recent experiment with INQUERY (1996) 0.00

0.0010177598 = product of:
  0.011195358 = sum of:
    0.011195358 = weight(_text_:of in 7568) [ClassicSimilarity], result of:
      0.011195358 = score(doc=7568,freq=2.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.20732689 = fieldWeight in 7568, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=7568)
  0.09090909 = coord(1/11)

Imprint: Gaithersburg, MD : National Institute of Standards and Technology

Liu, X.; Croft, W.B.: Cluster-based retrieval using language models (2004) 0.00

0.0010177598 = product of:
  0.011195358 = sum of:
    0.011195358 = weight(_text_:of in 4115) [ClassicSimilarity], result of:
      0.011195358 = score(doc=4115,freq=2.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.20732689 = fieldWeight in 4115, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=4115)
  0.09090909 = coord(1/11)

Source: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al.

Allan, J.; Croft, W.B.; Callan, J.: The University of Massachusetts and a dozen TRECs (2005) 0.00

0.0010177598 = product of:
  0.011195358 = sum of:
    0.011195358 = weight(_text_:of in 5086) [ClassicSimilarity], result of:
      0.011195358 = score(doc=5086,freq=2.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.20732689 = fieldWeight in 5086, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=5086)
  0.09090909 = coord(1/11)

Croft, W.B.; Metzler, D.; Strohman, T.: Search engines : information retrieval in practice (2010) 0.00
```
0.0010177598 = product of:
  0.011195358 = sum of:
    0.011195358 = weight(_text_:of in 2605) [ClassicSimilarity], result of:
      0.011195358 = score(doc=2605,freq=8.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.20732689 = fieldWeight in 2605, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2605)
  0.09090909 = coord(1/11)
```
Abstract

For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. SUPPLEMENTS / Extensive lecture slides (in PDF and PPT format) / Solutions to selected end-of-chapter problems (Instructors only) / Test collections for exercises / Galago search engine

Luk, R.W.P.; Leong, H.V.; Dillon, T.S.; Chan, A.T.S.; Croft, W.B.; Allen, J.: A survey in indexing and searching XML documents (2002) 0.00
```
7.196649E-4 = product of:
  0.007916314 = sum of:
    0.007916314 = weight(_text_:of in 460) [ClassicSimilarity], result of:
      0.007916314 = score(doc=460,freq=4.0), product of:
        0.053998582 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.034531306 = queryNorm
        0.14660224 = fieldWeight in 460, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=460)
  0.09090909 = coord(1/11)
```
Abstract

XML holds the promise to yield (1) a more precise search by providing additional information in the elements, (2) a better integrated search of documents from heterogeneous sources, (3) a powerful search paradigm using structural as well as content specifications, and (4) data and information exchange to share resources and to support cooperative search. We survey several indexing techniques for XML documents, grouping them into flatfile, semistructured, and structured indexing paradigms. Searching techniques and supporting techniques for searching are reviewed, including full text search and multistage search. Because searching XML documents can be very flexible, various search result presentations are discussed, as well as database and information retrieval system integration and XML query languages. We also survey various retrieval models, examining how they would be used or extended for retrieving XML documents. To conclude the article, we discuss various open issues that XML poses with respect to information retrieval and database research.

Source

Journal of the American Society for Information Science and Technology. 53(2002) no.6, S.415-437
