Search (31 results, page 1 of 2)

Croft, W.B.: Hypertext and information retrieval : what are the fundamental concepts? (1990) 0.05

0.05414349 = product of:
  0.16243047 = sum of:
    0.10067343 = weight(_text_:applications in 8003) [ClassicSimilarity], result of:
      0.10067343 = score(doc=8003,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.5836958 = fieldWeight in 8003, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.09375 = fieldNorm(doc=8003)
    0.012701439 = weight(_text_:of in 8003) [ClassicSimilarity], result of:
      0.012701439 = score(doc=8003,freq=2.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.20732689 = fieldWeight in 8003, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=8003)
    0.0490556 = weight(_text_:systems in 8003) [ClassicSimilarity], result of:
      0.0490556 = score(doc=8003,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.4074492 = fieldWeight in 8003, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.09375 = fieldNorm(doc=8003)
  0.33333334 = coord(3/9)
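
The score tree above can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers listed for doc 8003. The function names (`idf`, `tf`, `term_score`) are illustrative, and double-precision arithmetic means the last digits may differ slightly from the single-precision values printed in the listing.

```python
import math

# Constants copied from the explain output for doc 8003.
MAX_DOCS = 44218
QUERY_NORM = 0.03917671
FIELD_NORM = 0.09375


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it reproduces e.g. 4.4025097 for docFreq=1471.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Assumed formula; sqrt(2.0) = 1.4142135 as in the listing.
    return math.sqrt(freq)


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight
    query_weight = idf(doc_freq) * query_norm
    field_weight = tf(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (termFreq, docFreq) for the three matching query terms.
terms = {
    "applications": (2.0, 1471),
    "of": (2.0, 25162),
    "systems": (2.0, 5561),
}

partial = {t: term_score(f, df, FIELD_NORM) for t, (f, df) in terms.items()}
total = sum(partial.values())   # listed as 0.16243047
score = total * (3 / 9)         # coord(3/9): 3 of 9 query clauses matched

print(partial)
print(total, score)
```

The final multiplication by coord(3/9) explains why a record matching only one of the nine query clauses ends up with a much smaller score even when its single term weight is comparable.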

Source: Hypertext: concepts, systems and applications: Proceedings of the First European Conference on Hypertext, INRIA, France, Nov. 1990. Ed.: N. Streitz et al

Croft, W.B.: Knowledge-based and statistical approaches to text retrieval (1993) 0.04

0.044364154 = product of:
  0.1996387 = sum of:
    0.13423124 = weight(_text_:applications in 7863) [ClassicSimilarity], result of:
      0.13423124 = score(doc=7863,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.7782611 = fieldWeight in 7863, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.125 = fieldNorm(doc=7863)
    0.06540746 = weight(_text_:systems in 7863) [ClassicSimilarity], result of:
      0.06540746 = score(doc=7863,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.5432656 = fieldWeight in 7863, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.125 = fieldNorm(doc=7863)
  0.22222222 = coord(2/9)

Source: IEEE expert intelligent systems and their applications. 8(1993) no.2, S.8-12

Croft, W.B.: Effective retrieval based on combining evidence from the corpus and users (1995) 0.02

0.018173773 = product of:
  0.081781976 = sum of:
    0.06711562 = weight(_text_:applications in 4489) [ClassicSimilarity], result of:
      0.06711562 = score(doc=4489,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.38913056 = fieldWeight in 4489, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0625 = fieldNorm(doc=4489)
    0.014666359 = weight(_text_:of in 4489) [ClassicSimilarity], result of:
      0.014666359 = score(doc=4489,freq=6.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.23940048 = fieldWeight in 4489, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=4489)
  0.22222222 = coord(2/9)

Abstract: Inquery is a text retrieval system that is the basis of a number of WWW applications, including the Thomas system supported by the Library of Congress. Surveys the representation, query processing, and retrieval techniques used in the system. By combining evidence about relevance from the corpus, individual documents, and users, Inquery achieves effective overall recall and precision evaluation while avoiding occasional major failures

Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.02
```
0.017583163 = product of:
  0.079124235 = sum of:
    0.059322387 = weight(_text_:applications in 4277) [ClassicSimilarity], result of:
      0.059322387 = score(doc=4277,freq=4.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.34394607 = fieldWeight in 4277, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4277)
    0.019801848 = weight(_text_:of in 4277) [ClassicSimilarity], result of:
      0.019801848 = score(doc=4277,freq=28.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.32322758 = fieldWeight in 4277, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4277)
  0.22222222 = coord(2/9)
```
Abstract

This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.

Source

Annual review of information science and technology. 39(2005), S.3-32

Croft, W.B.; Thompson, R.H.: I3R: a new approach to the design of document retrieval systems (1987) 0.02

0.017375076 = product of:
  0.07818784 = sum of:
    0.020956306 = weight(_text_:of in 3898) [ClassicSimilarity], result of:
      0.020956306 = score(doc=3898,freq=4.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.34207192 = fieldWeight in 3898, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.109375 = fieldNorm(doc=3898)
    0.057231534 = weight(_text_:systems in 3898) [ClassicSimilarity], result of:
      0.057231534 = score(doc=3898,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.47535738 = fieldWeight in 3898, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.109375 = fieldNorm(doc=3898)
  0.22222222 = coord(2/9)

Source: Journal of the American Society for Information Science. 38(1987), S.389-404

Belkin, N.J.; Croft, W.B.: Retrieval techniques (1987) 0.01

0.013199662 = product of:
  0.05939848 = sum of:
    0.016935252 = weight(_text_:of in 334) [ClassicSimilarity], result of:
      0.016935252 = score(doc=334,freq=2.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.27643585 = fieldWeight in 334, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.125 = fieldNorm(doc=334)
    0.042463228 = product of:
      0.084926456 = sum of:
        0.084926456 = weight(_text_:22 in 334) [ClassicSimilarity], result of:
          0.084926456 = score(doc=334,freq=2.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.61904186 = fieldWeight in 334, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=334)
      0.5 = coord(1/2)
  0.22222222 = coord(2/9)
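
The tree for doc 334 has one extra level: the match on `22` sits inside a nested sub-query, so it is scaled by its own coord(1/2) before the outer coord(2/9) is applied. A minimal sketch of that arithmetic follows, under the same assumed formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the helper name `term_score` is illustrative, not part of any library.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.03917671
FIELD_NORM = 0.125  # fieldNorm(doc=334)


def term_score(freq, doc_freq):
    # queryWeight * fieldWeight, with the assumed classic tf/idf formulas.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


of_score = term_score(2.0, 25162)        # listed as 0.016935252
inner = term_score(2.0, 3622) * (1 / 2)  # term "22", nested coord(1/2)
score = (of_score + inner) * (2 / 9)     # outer coord(2/9)

print(of_score, inner, score)
```

Because coord factors multiply, the nested clause contributes only half of its raw term weight before the outer penalty is applied.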

Source: Annual review of information science and technology. 22(1987), S.109-145

Croft, W.B.: Advances in information retrieval : Recent research from the Center for Intelligent Information Retrieval (2000) 0.01

0.010152737 = product of:
  0.045687314 = sum of:
    0.010999769 = weight(_text_:of in 6860) [ClassicSimilarity], result of:
      0.010999769 = score(doc=6860,freq=6.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.17955035 = fieldWeight in 6860, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=6860)
    0.034687545 = weight(_text_:systems in 6860) [ClassicSimilarity], result of:
      0.034687545 = score(doc=6860,freq=4.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.28811008 = fieldWeight in 6860, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.046875 = fieldNorm(doc=6860)
  0.22222222 = coord(2/9)

Content: Contains the contributions: CROFT, W.B.: Combining approaches to information retrieval; GREIFF, W.R.: The use of exploratory data analysis in information retrieval research; PONTE, J.M.: Language models for relevance feedback; PAPKA, R. and J. ALLAN: Topic detection and tracking: event clustering as a basis for first story detection; CALLAN, J.: Distributed information retrieval; XU, J. and W.B. CROFT: Topic-based language models for distributed retrieval; LU, Z. and K.S. McKINLEY: The effect of collection organization and query locality on information retrieval system performance; BALLESTEROS, L.A.: Cross-language retrieval via transitive translation; SANDERSON, M. and D. LAWRIE: Building, testing, and applying concept hierarchies; RAVELA, S. and C. LUO: Appearance-based global similarity retrieval of images
LCSH: Multimedia systems
Subject: Multimedia systems

Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.01

0.009210852 = product of:
  0.04144883 = sum of:
    0.0128330635 = weight(_text_:of in 4028) [ClassicSimilarity], result of:
      0.0128330635 = score(doc=4028,freq=6.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.20947541 = fieldWeight in 4028, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4028)
    0.028615767 = weight(_text_:systems in 4028) [ClassicSimilarity], result of:
      0.028615767 = score(doc=4028,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.23767869 = fieldWeight in 4028, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4028)
  0.22222222 = coord(2/9)

Abstract: Reports on an analysis of lexical ambiguity in information retrieval text collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. Results show that there is considerable ambiguity even in a specialised database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors determine whether disambiguation will improve performance; for example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Discusses other uses of word sense disambiguation in an information retrieval context
Source: ACM transactions on information systems. 10(1992) no.2, S.115-141

Jing, Y.; Croft, W.B.: An association thesaurus for information retrieval (199?) 0.01

0.008687538 = product of:
  0.03909392 = sum of:
    0.010478153 = weight(_text_:of in 4494) [ClassicSimilarity], result of:
      0.010478153 = score(doc=4494,freq=4.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.17103596 = fieldWeight in 4494, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4494)
    0.028615767 = weight(_text_:systems in 4494) [ClassicSimilarity], result of:
      0.028615767 = score(doc=4494,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.23767869 = fieldWeight in 4494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4494)
  0.22222222 = coord(2/9)

Abstract: Although commonly used in both commercial and experimental information retrieval systems, thesauri have not demonstrated consistent benefits for retrieval performance, and it is difficult to construct a thesaurus automatically for large text databases. In this paper, an approach, called PhraseFinder, is proposed to construct collection-dependent association thesauri automatically using large full-text document collections. The association thesaurus can be accessed through natural language queries in INQUERY, an information retrieval system based on the probabilistic inference network. Experiments are conducted in INQUERY to evaluate different types of association thesauri, and thesauri constructed for a variety of collections

Allan, J.; Callan, J.P.; Croft, W.B.; Ballesteros, L.; Broglio, J.; Xu, J.; Shu, H.: INQUERY at TREC-5 (1997) 0.01

0.008249789 = product of:
  0.03712405 = sum of:
    0.010584532 = weight(_text_:of in 3103) [ClassicSimilarity], result of:
      0.010584532 = score(doc=3103,freq=2.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.17277241 = fieldWeight in 3103, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=3103)
    0.026539518 = product of:
      0.053079035 = sum of:
        0.053079035 = weight(_text_:22 in 3103) [ClassicSimilarity], result of:
          0.053079035 = score(doc=3103,freq=2.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.38690117 = fieldWeight in 3103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3103)
      0.5 = coord(1/2)
  0.22222222 = coord(2/9)

Date: 27. 2.1999 20:55:22
Imprint: Gaithersburg, MD : National Institute of Standards and Technology

Croft, W.B.: What do people want from information retrieval? : the top 10 research issues for companies that use and sell IR systems (1995) 0.01

0.005450622 = product of:
  0.0490556 = sum of:
    0.0490556 = weight(_text_:systems in 3402) [ClassicSimilarity], result of:
      0.0490556 = score(doc=3402,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.4074492 = fieldWeight in 3402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.09375 = fieldNorm(doc=3402)
  0.11111111 = coord(1/9)

Belkin, N.J.; Croft, W.B.: Information filtering and information retrieval : two sides of the same coin? (1992) 0.00

0.003155698 = product of:
  0.028401282 = sum of:
    0.028401282 = weight(_text_:of in 6093) [ClassicSimilarity], result of:
      0.028401282 = score(doc=6093,freq=10.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.46359703 = fieldWeight in 6093, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=6093)
  0.11111111 = coord(1/9)

Abstract: One of nine articles in this issue of Communications of the ACM devoted to information filtering
Source: Communications of the Association for Computing Machinery. 35(1992) no.12, S.29-38

Croft, W.B.: Clustering large files of documents using the single link method (1977) 0.00

0.002661118 = product of:
  0.023950063 = sum of:
    0.023950063 = weight(_text_:of in 5489) [ClassicSimilarity], result of:
      0.023950063 = score(doc=5489,freq=4.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.39093933 = fieldWeight in 5489, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.125 = fieldNorm(doc=5489)
  0.11111111 = coord(1/9)

Source: Journal of the American Society for Information Science. 28(1977), S.341-344

Croft, W.B.: Automatic indexing : file organization and display for information retrieval (1989) 0.00

0.0026297483 = product of:
  0.023667734 = sum of:
    0.023667734 = weight(_text_:of in 2412) [ClassicSimilarity], result of:
      0.023667734 = score(doc=2412,freq=10.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.38633084 = fieldWeight in 2412, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=2412)
  0.11111111 = coord(1/9)

Source: Indexing: the state of our knowledge and the state of our ignorance. Proceedings of the 20th Annual Meeting of the American Society of Indexers, New York City, May 13, 1988. Ed.: B.H. Weinberg

Croft, W.B.; Harper, D.J.: Using probabilistic models of document retrieval without relevance information (1979) 0.00
```
0.002304596 = product of:
  0.020741362 = sum of:
    0.020741362 = weight(_text_:of in 4520) [ClassicSimilarity], result of:
      0.020741362 = score(doc=4520,freq=12.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.33856338 = fieldWeight in 4520, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=4520)
  0.11111111 = coord(1/9)
```
Abstract

Based on a probabilistic model, proposes strategies for the initial search and an intermediate search. Retrieval experiments with the Cranfield collection of 1,400 documents show that this initial search strategy is better than conventional search strategies both in terms of retrieval effectiveness and in terms of the number of queries that retrieve relevant documents. The intermediate search is a useful substitute for a relevance feedback search. A cluster search would be an effective alternative strategy.

Source

Journal of documentation. 35(1979) no.4, S.285-295
Tavakoli, L.; Zamani, H.; Scholer, F.; Croft, W.B.; Sanderson, M.: Analyzing clarification in asynchronous information-seeking conversations (2022) 0.00
```
0.0022314154 = product of:
  0.020082738 = sum of:
    0.020082738 = weight(_text_:of in 496) [ClassicSimilarity], result of:
      0.020082738 = score(doc=496,freq=20.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.32781258 = fieldWeight in 496, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=496)
  0.11111111 = coord(1/9)
```
Abstract

This research analyzes human-generated clarification questions to provide insights into how they are used to disambiguate and provide a better understanding of information needs. A set of clarification questions is extracted from posts on the Stack Exchange platform. A novel taxonomy is defined for the annotation of the questions and their responses. We investigate the clarification questions in terms of whether they add any information to the post (the initial question posted by the asker) and the accepted answer, which is the answer chosen by the asker. After identifying which clarification questions are more useful, we investigate the characteristics of these questions in terms of their types and patterns. Non-useful clarification questions are identified, and their patterns are compared with useful clarifications. Our analysis indicates that the most useful clarification questions have similar patterns, regardless of topic. This research contributes to an understanding of clarification in conversations and can provide insight for clarification dialogues in conversational search scenarios and for the possible system generation of clarification requests in information-seeking conversations.

Source

Journal of the Association for Information Science and Technology. 73(2022) no.3, S.449-471
Murdock, V.; Kelly, D.; Croft, W.B.; Belkin, N.J.; Yuan, X.: Identifying and improving retrieval for procedural questions (2007) 0.00
```
0.0021169065 = product of:
  0.019052157 = sum of:
    0.019052157 = weight(_text_:of in 902) [ClassicSimilarity], result of:
      0.019052157 = score(doc=902,freq=18.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.3109903 = fieldWeight in 902, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=902)
  0.11111111 = coord(1/9)
```
Abstract

People use questions to elicit information from other people in their everyday lives, and yet the most common method of obtaining information from a search engine is by posing keywords. There has been research that suggests users are better at expressing their information needs in natural language; however, the vast majority of work to improve document retrieval has focused on queries posed as sets of keywords or Boolean queries. This paper focuses on improving document retrieval for the subset of natural language questions asking about how something is done. We classify questions as asking either for a description of a process or for a statement of fact, with better than 90% accuracy. Further, we identify non-content features of documents relevant to questions asking about a process. Finally, we demonstrate that we can use these features to significantly improve the precision of document retrieval results for questions asking about a process. Our approach, based on exploiting the structure of documents, shows a significant improvement in precision at rank one for questions asking about how something is done.
Rajashekar, T.B.; Croft, W.B.: Combining automatic and manual index representations in probabilistic retrieval (1995) 0.00
```
0.0020165213 = product of:
  0.018148692 = sum of:
    0.018148692 = weight(_text_:of in 2418) [ClassicSimilarity], result of:
      0.018148692 = score(doc=2418,freq=12.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.29624295 = fieldWeight in 2418, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2418)
  0.11111111 = coord(1/9)
```
Abstract

Results from research in information retrieval have suggested that significant improvements in retrieval effectiveness can be obtained by combining results from multiple index representations, query formulations, and search strategies. The inference net model of retrieval, which was designed from this point of view, treats information retrieval as an evidential reasoning process where multiple sources of evidence about document and query content are combined to estimate relevance probabilities. Uses a system based on this model to study the retrieval effectiveness benefits of combining these types of document and query information that are found in typical commercial databases and information services. The results indicate that substantial real benefits are possible

Source

Journal of the American Society for Information Science. 46(1995) no.4, S.272-283
Croft, W.B.: Combining approaches to information retrieval (2000) 0.00
```
0.0018669361 = product of:
  0.016802425 = sum of:
    0.016802425 = weight(_text_:of in 6862) [ClassicSimilarity], result of:
      0.016802425 = score(doc=6862,freq=14.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2742677 = fieldWeight in 6862, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=6862)
  0.11111111 = coord(1/9)
```
Abstract

The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the "meta-search" engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model
Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.00
```
0.0017284468 = product of:
  0.015556021 = sum of:
    0.015556021 = weight(_text_:of in 38) [ClassicSimilarity], result of:
      0.015556021 = score(doc=38,freq=12.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.25392252 = fieldWeight in 38, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=38)
  0.11111111 = coord(1/9)
```
Abstract

Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve effectiveness of distributed retrieval
