Search (8 results, page 1 of 1)

Sparck Jones, K.; Walker, S.; Robertson, S.E.: ¬A probabilistic model of information retrieval : development and comparative experiments - part 1 (2000) 0.00

0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 4181) [ClassicSimilarity], result of:
          0.011481222 = score(doc=4181,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 4181, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=4181)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Sparck Jones, K.; Walker, S.; Robertson, S.E.: ¬A probabilistic model of information retrieval : development and comparative experiments - part 2 (2000) 0.00

0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 4286) [ClassicSimilarity], result of:
          0.011481222 = score(doc=4286,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 4286, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=4286)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Sparck Jones, K.: Some thoughts on classification for retrieval (2005) 0.00
```
0.0026742492 = product of:
  0.0053484985 = sum of:
    0.0053484985 = product of:
      0.010696997 = sum of:
        0.010696997 = weight(_text_:a in 4392) [ClassicSimilarity], result of:
          0.010696997 = score(doc=4392,freq=20.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20142901 = fieldWeight in 4392, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4392)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Purpose - This paper was originally published in 1970 (Journal of documentation. 26(1970), S.89-101), considered the suggestion that classifications for retrieval should be constructed automatically and raised some serious problems concerning the sorts of classification which were required, and the way in which formal classification theories should be exploited, given that a retrieval classification is required for a purpose. These difficulties had not been sufficiently considered, and the paper, therefore, aims to attempt an analysis of them, though no solutions of immediate application could be suggested. Design/methodology/approach - Starting with the illustrative proposition that a polythetic, multiple, unordered classification is required in automatic thesaurus construction, this is considered in the context of classification in general, where eight sorts of classification can be distinguished, each covering a range of class definitions and class-finding algorithms. Findings - Since there is generally no natural or best classification of a set of objects as such, the evaluation of alternative classifications requires either formal criteria of goodness of fit, or, if a classification is required for a purpose, a precise statement of that purpose. In any case a substantive theory of classification is needed, which does not exist; and, since sufficiently precise specifications of retrieval requirements are also lacking, the only currently available approach to automatic classification experiments for information retrieval is to do enough of them. Originality/value - Gives insights into the classification of material for information retrieval.

Type

a
Sparck Jones, K.: Revisiting classification for retrieval (2005) 0.00
```
0.0026473717 = product of:
  0.0052947435 = sum of:
    0.0052947435 = product of:
      0.010589487 = sum of:
        0.010589487 = weight(_text_:a in 4328) [ClassicSimilarity], result of:
          0.010589487 = score(doc=4328,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.19940455 = fieldWeight in 4328, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4328)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Purpose - This short note seeks to respond to Hjørland and Pederson's paper "A substantive theory of classification for information retrieval" which starts from Sparck Jones's, "Some thoughts on classification for retrieval", originally published in 1970. Design/methodology/approach - The note comments on the context in which the 1970 paper was written, and on Hjørland and Pedersen's views, emphasising the need for well-grounded classification theory and application. Findings - The note maintains that text-based, a posteriori, classification, as increasingly found in applications, is likely to be more useful, in general, than a priori classification. Originality/value - The note elaborates on points made in a well-received earlier paper.

Type

a

Sparck Jones, K.: IDF term weighting and IR research lessons (2004) 0.00

0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 4422) [ClassicSimilarity], result of:
          0.009567685 = score(doc=4422,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 4422, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=4422)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Robertson comments on the theoretical status of IDF term weighting. Its history illustrates how ideas develop in a specific research context, in theory/experiment interaction, and in operational practice.
Type: a

Sparck Jones, K.: ¬A statistical interpretation of term specificity and its application in retrieval (2004) 0.00
```
0.0020506454 = product of:
  0.004101291 = sum of:
    0.004101291 = product of:
      0.008202582 = sum of:
        0.008202582 = weight(_text_:a in 4420) [ClassicSimilarity], result of:
          0.008202582 = score(doc=4420,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.1544581 = fieldWeight in 4420, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4420)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing, in particular, that frequently-occurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.

Type

a

Sparck Jones, K.: Metareflections on TREC (2005) 0.00

0.0020296127 = product of:
  0.0040592253 = sum of:
    0.0040592253 = product of:
      0.008118451 = sum of:
        0.008118451 = weight(_text_:a in 5092) [ClassicSimilarity], result of:
          0.008118451 = score(doc=5092,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15287387 = fieldWeight in 5092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=5092)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Sparck Jones, K.: Automatic summarising : the state of the art (2007) 0.00

0.0010148063 = product of:
  0.0020296127 = sum of:
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = weight(_text_:a in 932) [ClassicSimilarity], result of:
          0.0040592253 = score(doc=932,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.07643694 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Search (8 results, page 1 of 1)

Authors

Themes