Search (8 results, page 1 of 1)

Sparck Jones, K.: IDF term weighting and IR research lessons (2004) 0.01

```
0.0068172095 = product of:
  0.027268838 = sum of:
    0.027268838 = product of:
      0.08180651 = sum of:
        0.08180651 = weight(_text_:theory in 4422) [ClassicSimilarity], result of:
          0.08180651 = score(doc=4422,freq=2.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.4594418 = fieldWeight in 4422, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.078125 = fieldNorm(doc=4422)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Abstract: Robertson comments on the theoretical status of IDF term weighting. Its history illustrates how ideas develop in a specific research context, in theory/experiment interaction, and in operational practice.
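The score tree for this first record (doc 4422) can be checked by hand. The sketch below recomputes it from the leaf values shown above, assuming the standard Lucene ClassicSimilarity formulas: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical and chosen only for illustration; queryNorm, fieldNorm and the coord fractions are copied from the explanation, not derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by Lucene's ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Rebuild the score tree: queryWeight * fieldWeight * coord factors."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    score = query_weight * field_weight
    for matched, total in coords:                        # e.g. coord(1/3), coord(1/4)
        score *= matched / total
    return score


# Leaf values copied from the explanation for doc 4422 (term "theory").
score = explain_score(
    freq=2.0,
    doc_freq=1878,
    max_docs=44218,
    query_norm=0.042818543,
    field_norm=0.078125,
    coords=[(1, 3), (1, 4)],
)
print(f"{score:.10f}")  # should land close to the 0.0068172095 reported above
```

Lucene computes these values in 32-bit floats, so the last digits of a double-precision recomputation may differ slightly from the explanation.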

Sparck Jones, K.: Revisiting classification for retrieval (2005) 0.01
```
0.0067486926 = product of:
  0.02699477 = sum of:
    0.02699477 = product of:
      0.08098431 = sum of:
        0.08098431 = weight(_text_:theory in 4328) [ClassicSimilarity], result of:
          0.08098431 = score(doc=4328,freq=4.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.45482418 = fieldWeight in 4328, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4328)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract: Purpose - This short note seeks to respond to Hjørland and Pedersen's paper "A substantive theory of classification for information retrieval", which starts from Sparck Jones's "Some thoughts on classification for retrieval", originally published in 1970. Design/methodology/approach - The note comments on the context in which the 1970 paper was written, and on Hjørland and Pedersen's views, emphasising the need for well-grounded classification theory and application. Findings - The note maintains that text-based, a posteriori classification, as increasingly found in applications, is likely to be more useful, in general, than a priori classification. Originality/value - The note elaborates on points made in a well-received earlier paper.
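This record (doc 4328) contains the term "theory" four times, yet it ranks below doc 4422, which contains it only twice. The sketch below compares the two fieldWeight values using the numbers from the explanations. It assumes, as is usual for ClassicSimilarity, that fieldNorm mainly encodes length normalization, so a smaller fieldNorm suggests a longer field; the helper name `field_weight` is hypothetical.

```python
import math

# idf for the term "theory", copied from the explanations above.
IDF_THEORY = 4.1583924


def field_weight(freq: float, idf: float, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)."""
    return math.sqrt(freq) * idf * field_norm


# doc 4422: two occurrences, larger fieldNorm.
fw_4422 = field_weight(2.0, IDF_THEORY, 0.078125)
# doc 4328: four occurrences, smaller fieldNorm.
fw_4328 = field_weight(4.0, IDF_THEORY, 0.0546875)

print(f"doc 4422 fieldWeight: {fw_4422:.7f}")
print(f"doc 4328 fieldWeight: {fw_4328:.7f}")
print("doc 4422 ranks higher:", fw_4422 > fw_4328)
```

Doubling the term count raises tf only by a factor of sqrt(2), which here is not enough to offset the lower fieldNorm.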

Sparck Jones, K.; Walker, S.; Robertson, S.E.: A probabilistic model of information retrieval: development and comparative experiments - part 1 (2000) 0.01

```
0.005853982 = product of:
  0.023415929 = sum of:
    0.023415929 = product of:
      0.070247784 = sum of:
        0.070247784 = weight(_text_:29 in 4181) [ClassicSimilarity], result of:
          0.070247784 = score(doc=4181,freq=2.0), product of:
            0.15062225 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.042818543 = queryNorm
            0.46638384 = fieldWeight in 4181, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=4181)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Date: 27.12.2007 19:27:29

Sparck Jones, K.: Metareflections on TREC (2005) 0.01

```
0.005853982 = product of:
  0.023415929 = sum of:
    0.023415929 = product of:
      0.070247784 = sum of:
        0.070247784 = weight(_text_:29 in 5092) [ClassicSimilarity], result of:
          0.070247784 = score(doc=5092,freq=2.0), product of:
            0.15062225 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.042818543 = queryNorm
            0.46638384 = fieldWeight in 5092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=5092)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Date: 29. 3.1996 18:16:49

Robertson, S.E.; Sparck Jones, K.: Relevance weighting of search terms (1976) 0.01

```
0.0054537673 = product of:
  0.021815069 = sum of:
    0.021815069 = product of:
      0.06544521 = sum of:
        0.06544521 = weight(_text_:theory in 71) [ClassicSimilarity], result of:
          0.06544521 = score(doc=71,freq=2.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.36755344 = fieldWeight in 71, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.0625 = fieldNorm(doc=71)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Abstract: Examines statistical techniques for exploiting relevance information to weight search terms. These techniques are presented as a natural extension of weighting methods using information about the distribution of index terms in documents in general. A series of relevance weighting functions is derived and is justified by theoretical considerations. In particular, it is shown that specific weighted search methods are implied by a general probabilistic theory of retrieval. Different applications of relevance weighting are illustrated by experimental results for test collections.

Sparck Jones, K.: Some thoughts on classification for retrieval (1970) 0.00
```
0.0034086048 = product of:
  0.013634419 = sum of:
    0.013634419 = product of:
      0.040903255 = sum of:
        0.040903255 = weight(_text_:theory in 4327) [ClassicSimilarity], result of:
          0.040903255 = score(doc=4327,freq=2.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.2297209 = fieldWeight in 4327, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4327)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract: The suggestion that classifications for retrieval should be constructed automatically raises some serious problems concerning the sorts of classification which are required, and the way in which formal classification theories should be exploited, given that a retrieval classification is required for a purpose. These difficulties have not been sufficiently considered, and the paper therefore attempts an analysis of them, though no solution of immediate application can be suggested. Starting with the illustrative proposition that a polythetic, multiple, unordered classification is required in automatic thesaurus construction, this is considered in the context of classification in general, where eight sorts of classification can be distinguished, each covering a range of class definitions and class-finding algorithms. The problem which follows is that since there is generally no natural or best classification of a set of objects as such, the evaluation of alternative classifications requires either formal criteria of goodness of fit, or, if a classification is required for a purpose, a precise statement of that purpose. In any case a substantive theory of classification is needed, which does not exist; and since sufficiently precise specifications of retrieval requirements are also lacking, the only currently available approach to automatic classification experiments for information retrieval is to do enough of them.

Sparck Jones, K.: Some thoughts on classification for retrieval (2005) 0.00
```
0.0034086048 = product of:
  0.013634419 = sum of:
    0.013634419 = product of:
      0.040903255 = sum of:
        0.040903255 = weight(_text_:theory in 4392) [ClassicSimilarity], result of:
          0.040903255 = score(doc=4392,freq=2.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.2297209 = fieldWeight in 4392, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4392)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract: Purpose - This paper, originally published in 1970 (Journal of Documentation 26 (1970), pp. 89-101), considered the suggestion that classifications for retrieval should be constructed automatically and raised some serious problems concerning the sorts of classification which were required, and the way in which formal classification theories should be exploited, given that a retrieval classification is required for a purpose. These difficulties had not been sufficiently considered, and the paper therefore attempted an analysis of them, though no solutions of immediate application could be suggested. Design/methodology/approach - Starting with the illustrative proposition that a polythetic, multiple, unordered classification is required in automatic thesaurus construction, this is considered in the context of classification in general, where eight sorts of classification can be distinguished, each covering a range of class definitions and class-finding algorithms. Findings - Since there is generally no natural or best classification of a set of objects as such, the evaluation of alternative classifications requires either formal criteria of goodness of fit, or, if a classification is required for a purpose, a precise statement of that purpose. In any case a substantive theory of classification is needed, which does not exist; and, since sufficiently precise specifications of retrieval requirements are also lacking, the only currently available approach to automatic classification experiments for information retrieval is to do enough of them. Originality/value - Gives insights into the classification of material for information retrieval.

Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00

```
0.0023860233 = product of:
  0.009544093 = sum of:
    0.009544093 = product of:
      0.028632278 = sum of:
        0.028632278 = weight(_text_:theory in 3645) [ClassicSimilarity], result of:
          0.028632278 = score(doc=3645,freq=2.0), product of:
            0.1780563 = queryWeight, product of:
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.042818543 = queryNorm
            0.16080463 = fieldWeight in 3645, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1583924 = idf(docFreq=1878, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3645)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Source: Theory of subject analysis: a sourcebook. Ed.: L.M. Chan, et al
