Search (5 results, page 1 of 1)

  • Active filter: author_ss:"Callan, J."
  1. Callan, J.: Distributed information retrieval (2000) 0.01
    0.009228281 = product of:
      0.0230707 = sum of:
        0.009535614 = weight(_text_:a in 31) [ClassicSimilarity], result of:
          0.009535614 = score(doc=31,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.17835285 = fieldWeight in 31, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=31)
        0.013535086 = product of:
          0.027070172 = sum of:
            0.027070172 = weight(_text_:information in 31) [ClassicSimilarity], result of:
              0.027070172 = score(doc=31,freq=12.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.3325631 = fieldWeight in 31, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=31)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
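
    A note on the score breakdown above: it is Lucene's ClassicSimilarity (TF-IDF) explain output. Each term contributes queryWeight * fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the coord factors downweight documents that match only some of the query clauses. A minimal Python sketch reproducing this first score from the listed factors (queryNorm and fieldNorm copied from the output above):

    import math

    def tf(freq):                      # ClassicSimilarity term frequency
        return math.sqrt(freq)

    def idf(doc_freq, max_docs):       # ClassicSimilarity inverse document frequency
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    QUERY_NORM = 0.046368346           # from the explain output above
    FIELD_NORM = 0.0546875             # length norm for doc 31

    def term_score(freq, doc_freq, max_docs=44218):
        query_weight = idf(doc_freq, max_docs) * QUERY_NORM
        field_weight = tf(freq) * idf(doc_freq, max_docs) * FIELD_NORM
        return query_weight * field_weight

    score = (term_score(8.0, 37942)            # _text_:a
             + term_score(12.0, 20772) * 0.5   # _text_:information, coord(1/2)
            ) * 0.4                            # coord(2/5)
    print(score)                               # ~0.009228281, as listed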
    
    Abstract
    A multi-database model of distributed information retrieval is presented, in which people are assumed to have access to many searchable text databases. In such an environment, full-text information retrieval consists of discovering database contents, ranking databases by their expected ability to satisfy the query, searching a small number of databases, and merging results returned by different databases. This paper presents algorithms for each task. It also discusses how to reorganize conventional test collections into multi-database testbeds, and evaluation methodologies for multi-database experiments. A broad and diverse group of experimental results is presented to demonstrate that the algorithms are effective, efficient, robust, and scalable.
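
    The four tasks the abstract lists (content discovery, database ranking, selective search, result merging) form a pipeline; a hedged sketch follows. The helper names and the ranking heuristic are illustrative assumptions, not the book's actual algorithms (which include CORI-style resource ranking and score normalization):

    # Illustrative federated-search pipeline; all names are hypothetical.
    def federated_search(query, databases, k=3):
        # 1) Rank databases by expected ability to satisfy the query
        #    (naive document-frequency evidence stands in for CORI here).
        ranked = sorted(
            databases,
            key=lambda db: sum(db["df"].get(t, 0) for t in query.split()),
            reverse=True,
        )
        # 2) Search only a small number of the top-ranked databases.
        # 3) Merge: raw scores from different databases are not directly
        #    comparable, so normalize each result list by its own top
        #    score before interleaving.
        merged = []
        for db in ranked[:k]:
            results = db["search"](query)      # [(doc_id, raw_score), ...]
            top = max((s for _, s in results), default=1.0) or 1.0
            merged.extend((doc_id, s / top) for doc_id, s in results)
        return sorted(merged, key=lambda pair: pair[1], reverse=True)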
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
    Type
    a
  2. Allan, J.; Croft, W.B.; Callan, J.: The University of Massachusetts and a dozen TRECs (2005) 0.01
    0.008412599 = product of:
      0.021031497 = sum of:
        0.01155891 = weight(_text_:a in 5086) [ClassicSimilarity], result of:
          0.01155891 = score(doc=5086,freq=4.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.2161963 = fieldWeight in 5086, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=5086)
        0.009472587 = product of:
          0.018945174 = sum of:
            0.018945174 = weight(_text_:information in 5086) [ClassicSimilarity], result of:
              0.018945174 = score(doc=5086,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.23274569 = fieldWeight in 5086, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5086)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    TREC: experiment and evaluation in information retrieval. Ed.: E.M. Voorhees and D.K. Harman
    Type
    a
  3. Robertson, S.; Callan, J.: Routing and filtering (2005) 0.01
    0.008234787 = product of:
      0.020586967 = sum of:
        0.009535614 = weight(_text_:a in 4688) [ClassicSimilarity], result of:
          0.009535614 = score(doc=4688,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.17835285 = fieldWeight in 4688, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=4688)
        0.011051352 = product of:
          0.022102704 = sum of:
            0.022102704 = weight(_text_:information in 4688) [ClassicSimilarity], result of:
              0.022102704 = score(doc=4688,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.27153665 = fieldWeight in 4688, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4688)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    TREC: experiment and evaluation in information retrieval. Ed.: E.M. Voorhees and D.K. Harman
    Type
    a
  4. Callan, J.; Croft, W.B.; Broglio, J.: TREC and TIPSTER experiments with INQUERY (1995) 0.01
    0.007189882 = product of:
      0.017974705 = sum of:
        0.0068111527 = weight(_text_:a in 1944) [ClassicSimilarity], result of:
          0.0068111527 = score(doc=1944,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.12739488 = fieldWeight in 1944, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=1944)
        0.011163551 = product of:
          0.022327103 = sum of:
            0.022327103 = weight(_text_:information in 1944) [ClassicSimilarity], result of:
              0.022327103 = score(doc=1944,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.27429342 = fieldWeight in 1944, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1944)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997, pp.436-439.
    Source
    Information processing and management. 31(1995) no.3, pp.327-343
    Type
    a
  5. Collins-Thompson, K.; Callan, J.: Predicting reading difficulty with statistical language models (2005) 0.01
    0.0063194023 = product of:
      0.015798505 = sum of:
        0.01021673 = weight(_text_:a in 4579) [ClassicSimilarity], result of:
          0.01021673 = score(doc=4579,freq=18.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.19109234 = fieldWeight in 4579, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4579)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 4579) [ClassicSimilarity], result of:
              0.011163551 = score(doc=4579,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 4579, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4579)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    A potentially useful feature of information retrieval systems for students is the ability to identify documents that not only are relevant to the query but also match the student's reading level. Manually obtaining an estimate of reading difficulty for each document is not feasible for very large collections, so we require an automated technique. Traditional readability measures, such as the widely used Flesch-Kincaid measure, are simple to apply but perform poorly on Web pages and other nontraditional documents. This work focuses on building a broadly applicable statistical model of text for different reading levels that works for a wide range of documents. To do this, we recast the well-studied problem of readability in terms of text categorization and use straightforward techniques from statistical language modeling. We show that with a modified form of text categorization, it is possible to build generally applicable classifiers with relatively little training data. We apply this method to the problem of classifying Web pages according to their reading difficulty level and show that by using a mixture model to interpolate evidence of a word's frequency across grades, it is possible to build a classifier that achieves an average root mean squared error of between one and two grade levels for 9 of 12 grades. Such classifiers have very efficient implementations and can be applied in many different scenarios. The models can be varied to focus on smaller or larger grade ranges or easily retrained for a variety of tasks or populations.
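
    As a toy illustration of the language-model approach the abstract describes (not the authors' implementation, and without their cross-grade mixture interpolation): train one smoothed unigram model per grade level and assign a document to the grade whose model gives it the highest log-likelihood. Training data, vocabulary size, and smoothing below are illustrative assumptions.

    import math
    from collections import Counter

    # Toy per-grade unigram language models with Laplace smoothing.
    def train_grade_models(texts_by_grade, vocab_size=50000):
        models = {}
        for grade, texts in texts_by_grade.items():
            counts = Counter(w for t in texts for w in t.lower().split())
            total = sum(counts.values())
            # Default args bind this grade's counts; Laplace smoothing
            # gives unseen words nonzero probability.
            models[grade] = lambda w, c=counts, n=total: (c[w] + 1) / (n + vocab_size)
        return models

    def predict_grade(models, document):
        words = document.lower().split()
        return max(models, key=lambda g: sum(math.log(models[g](w)) for w in words))

    # Usage with tiny illustrative data:
    models = train_grade_models({
        1: ["the cat sat on the mat", "see the dog run"],
        9: ["the experimental methodology demonstrates statistical significance"],
    })
    print(predict_grade(models, "the dog sat on the mat"))  # likely grade 1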
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.13, pp.1448-1462
    Type
    a