Document (#16042)

Author
Sanderson, M.
Title
¬The Reuters test collection
Source
Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
Imprint
London : Taylor Graham
Year
1996
Pages
pp. 219-227
Abstract
Describes the Reuters test collection, which at 22,173 references is significantly larger than most traditional test collections. In addition, Reuters has none of the recall calculation problems normally associated with some of the larger test collections available. Explains the method derived by D.D. Lewis to perform retrieval experiments on the Reuters collection, and illustrates the use of the collection with some simple retrieval experiments that compare the performance of stemming algorithms.
Theme
Retrieval studies

Similar documents (author)

  1. Sanderson, M.: Revisiting h measured on UK LIS and IR academics (2008) 5.39
    5.3864803 = sum of:
      5.3864803 = weight(author_txt:sanderson in 3868) [ClassicSimilarity], result of:
        5.3864803 = fieldWeight in 3868, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.618368 = idf(docFreq=20, maxDocs=42740)
          0.625 = fieldNorm(doc=3868)
    
  2. Purves, R.S.; Sanderson, M.: ¬A methodology to allow avalanche forecasting on an information retrieval system (1998) 4.31
    4.309184 = sum of:
      4.309184 = weight(author_txt:sanderson in 2074) [ClassicSimilarity], result of:
        4.309184 = fieldWeight in 2074, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.618368 = idf(docFreq=20, maxDocs=42740)
          0.5 = fieldNorm(doc=2074)
    
  3. Sanderson, M.; Ruthven, I.: Report on the Glasgow IR group (glair4) submission (1997) 4.31
    4.309184 = sum of:
      4.309184 = weight(author_txt:sanderson in 4089) [ClassicSimilarity], result of:
        4.309184 = fieldWeight in 4089, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.618368 = idf(docFreq=20, maxDocs=42740)
          0.5 = fieldNorm(doc=4089)
    
  4. Sanderson, M.; Lawrie, D.: Building, testing, and applying concept hierarchies (2000) 4.31
    4.309184 = sum of:
      4.309184 = weight(author_txt:sanderson in 1038) [ClassicSimilarity], result of:
        4.309184 = fieldWeight in 1038, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.618368 = idf(docFreq=20, maxDocs=42740)
          0.5 = fieldNorm(doc=1038)
    
  5. Clough, P.; Sanderson, M.: User experiments with the Eurovision Cross-Language Image Retrieval System (2006) 4.31
    4.309184 = sum of:
      4.309184 = weight(author_txt:sanderson in 53) [ClassicSimilarity], result of:
        4.309184 = fieldWeight in 53, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.618368 = idf(docFreq=20, maxDocs=42740)
          0.5 = fieldNorm(doc=53)
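
The author-match scores above can be reproduced by hand. The sketch below assumes the classic TF-IDF definitions that match the printed numbers: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative only and are not Lucene API calls; the fieldNorm values are copied from the explain output rather than derived.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency in the form that matches the
    # printed value idf(docFreq=20, maxDocs=42740) = 8.618368.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Result 1 (doc 3868): fieldNorm 0.625
score_1 = field_weight(1.0, 20, 42740, 0.625)
# Results 2 to 5: fieldNorm 0.5
score_2 = field_weight(1.0, 20, 42740, 0.5)

print(round(score_1, 4), round(score_2, 4))
```

Because every author match has the same tf and idf, the ranking among these five results depends only on fieldNorm, which in this similarity model is larger for shorter fields.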
    

Similar documents (content)

  1. Debole, F.; Sebastiani, F.: ¬An analysis of the relative hardness of Reuters-21578 subsets (2005) 0.17
    0.16689546 = sum of:
      0.16689546 = product of:
        0.8344773 = sum of:
          0.029170986 = weight(abstract_txt:compare in 4457) [ClassicSimilarity], result of:
            0.029170986 = score(doc=4457,freq=1.0), product of:
              0.08301014 = queryWeight, product of:
                1.0983902 = boost
                5.622636 = idf(docFreq=419, maxDocs=42740)
                0.013441091 = queryNorm
              0.35141474 = fieldWeight in 4457, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.622636 = idf(docFreq=419, maxDocs=42740)
                0.0625 = fieldNorm(doc=4457)
          0.013651999 = weight(abstract_txt:retrieval in 4457) [ClassicSimilarity], result of:
            0.013651999 = score(doc=4457,freq=1.0), product of:
              0.06304315 = queryWeight, product of:
                1.3537081 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.013441091 = queryNorm
              0.21655008 = fieldWeight in 4457, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=4457)
          0.09404241 = weight(abstract_txt:collection in 4457) [ClassicSimilarity], result of:
            0.09404241 = score(doc=4457,freq=2.0), product of:
              0.22823885 = queryWeight, product of:
                3.6426368 = boost
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.013441091 = queryNorm
              0.41203508 = fieldWeight in 4457, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.0625 = fieldNorm(doc=4457)
          0.08547197 = weight(abstract_txt:test in 4457) [ClassicSimilarity], result of:
            0.08547197 = score(doc=4457,freq=1.0), product of:
              0.2698151 = queryWeight, product of:
                3.960538 = boost
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.013441091 = queryNorm
              0.31677982 = fieldWeight in 4457, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.0625 = fieldNorm(doc=4457)
          0.61213994 = weight(abstract_txt:reuters in 4457) [ClassicSimilarity], result of:
            0.61213994 = score(doc=4457,freq=3.0), product of:
              0.74876773 = queryWeight, product of:
                7.3764877 = boost
                7.5520167 = idf(docFreq=60, maxDocs=42740)
                0.013441091 = queryNorm
              0.8175298 = fieldWeight in 4457, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5520167 = idf(docFreq=60, maxDocs=42740)
                0.0625 = fieldNorm(doc=4457)
        0.2 = coord(5/25)
    
  2. Frei, H.P.; Stieger, D.: ¬The use of semantic links in hypertext information retrieval (1995) 0.17
    0.16617136 = sum of:
      0.16617136 = product of:
        0.59346914 = sum of:
          0.045214057 = weight(abstract_txt:addition in 1238) [ClassicSimilarity], result of:
            0.045214057 = score(doc=1238,freq=1.0), product of:
              0.07003676 = queryWeight, product of:
                1.0089139 = boost
                5.1646085 = idf(docFreq=663, maxDocs=42740)
                0.013441091 = queryNorm
              0.64557606 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1646085 = idf(docFreq=663, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.062642574 = weight(abstract_txt:algorithms in 1238) [ClassicSimilarity], result of:
            0.062642574 = score(doc=1238,freq=1.0), product of:
              0.08704092 = queryWeight, product of:
                1.1247418 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.013441091 = queryNorm
              0.7196911 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.04729191 = weight(abstract_txt:retrieval in 1238) [ClassicSimilarity], result of:
            0.04729191 = score(doc=1238,freq=3.0), product of:
              0.06304315 = queryWeight, product of:
                1.3537081 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.013441091 = queryNorm
              0.75015146 = fieldWeight in 1238, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.03286388 = weight(abstract_txt:some in 1238) [ClassicSimilarity], result of:
            0.03286388 = score(doc=1238,freq=1.0), product of:
              0.0713345 = queryWeight, product of:
                1.4399781 = boost
                3.6856086 = idf(docFreq=2913, maxDocs=42740)
                0.013441091 = queryNorm
              0.46070108 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6856086 = idf(docFreq=2913, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.10151674 = weight(abstract_txt:experiments in 1238) [ClassicSimilarity], result of:
            0.10151674 = score(doc=1238,freq=1.0), product of:
              0.1513023 = queryWeight, product of:
                2.0971467 = boost
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.013441091 = queryNorm
              0.6709531 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.13299607 = weight(abstract_txt:collection in 1238) [ClassicSimilarity], result of:
            0.13299607 = score(doc=1238,freq=1.0), product of:
              0.22823885 = queryWeight, product of:
                3.6426368 = boost
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.013441091 = queryNorm
              0.5827056 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
          0.17094395 = weight(abstract_txt:test in 1238) [ClassicSimilarity], result of:
            0.17094395 = score(doc=1238,freq=1.0), product of:
              0.2698151 = queryWeight, product of:
                3.960538 = boost
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.013441091 = queryNorm
              0.63355964 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.125 = fieldNorm(doc=1238)
        0.28 = coord(7/25)
    
  3. Cathey, R.J.; Jensen, E.C.; Beitzel, S.M.; Frieder, O.; Grossman, D.: Exploiting parallelism to support scalable hierarchical clustering (2007) 0.15
    0.14732997 = sum of:
      0.14732997 = product of:
        0.46040618 = sum of:
          0.027976107 = weight(abstract_txt:significantly in 2449) [ClassicSimilarity], result of:
            0.027976107 = score(doc=2449,freq=1.0), product of:
              0.08072758 = queryWeight, product of:
                1.0831835 = boost
                5.544793 = idf(docFreq=453, maxDocs=42740)
                0.013441091 = queryNorm
              0.34654957 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.544793 = idf(docFreq=453, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.04429499 = weight(abstract_txt:algorithms in 2449) [ClassicSimilarity], result of:
            0.04429499 = score(doc=2449,freq=2.0), product of:
              0.08704092 = queryWeight, product of:
                1.1247418 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.013441091 = queryNorm
              0.50889844 = fieldWeight in 2449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.013651999 = weight(abstract_txt:retrieval in 2449) [ClassicSimilarity], result of:
            0.013651999 = score(doc=2449,freq=1.0), product of:
              0.06304315 = queryWeight, product of:
                1.3537081 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.013441091 = queryNorm
              0.21655008 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.04821133 = weight(abstract_txt:collections in 2449) [ClassicSimilarity], result of:
            0.04821133 = score(doc=2449,freq=2.0), product of:
              0.116036996 = queryWeight, product of:
                1.8365566 = boost
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.013441091 = queryNorm
              0.4154824 = fieldWeight in 2449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.05075837 = weight(abstract_txt:experiments in 2449) [ClassicSimilarity], result of:
            0.05075837 = score(doc=2449,freq=1.0), product of:
              0.1513023 = queryWeight, product of:
                2.0971467 = boost
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.013441091 = queryNorm
              0.33547655 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.07486351 = weight(abstract_txt:larger in 2449) [ClassicSimilarity], result of:
            0.07486351 = score(doc=2449,freq=1.0), product of:
              0.19604413 = queryWeight, product of:
                2.3871682 = boost
                6.109931 = idf(docFreq=257, maxDocs=42740)
                0.013441091 = queryNorm
              0.3818707 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.109931 = idf(docFreq=257, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.11517796 = weight(abstract_txt:collection in 2449) [ClassicSimilarity], result of:
            0.11517796 = score(doc=2449,freq=3.0), product of:
              0.22823885 = queryWeight, product of:
                3.6426368 = boost
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.013441091 = queryNorm
              0.50463784 = fieldWeight in 2449, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
          0.08547197 = weight(abstract_txt:test in 2449) [ClassicSimilarity], result of:
            0.08547197 = score(doc=2449,freq=1.0), product of:
              0.2698151 = queryWeight, product of:
                3.960538 = boost
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.013441091 = queryNorm
              0.31677982 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.0625 = fieldNorm(doc=2449)
        0.32 = coord(8/25)
    
  4. Sparck Jones, K.; Rijsbergen, C.J. van: Progress in documentation : Information retrieval test collection (1976) 0.12
    0.122820914 = sum of:
      0.122820914 = product of:
        0.61410457 = sum of:
          0.028960263 = weight(abstract_txt:retrieval in 4230) [ClassicSimilarity], result of:
            0.028960263 = score(doc=4230,freq=2.0), product of:
              0.06304315 = queryWeight, product of:
                1.3537081 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.013441091 = queryNorm
              0.4593721 = fieldWeight in 4230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.09375 = fieldNorm(doc=4230)
          0.11434321 = weight(abstract_txt:collections in 4230) [ClassicSimilarity], result of:
            0.11434321 = score(doc=4230,freq=5.0), product of:
              0.116036996 = queryWeight, product of:
                1.8365566 = boost
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.013441091 = queryNorm
              0.98540306 = fieldWeight in 4230, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.09375 = fieldNorm(doc=4230)
          0.10767476 = weight(abstract_txt:experiments in 4230) [ClassicSimilarity], result of:
            0.10767476 = score(doc=4230,freq=2.0), product of:
              0.1513023 = queryWeight, product of:
                2.0971467 = boost
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.013441091 = queryNorm
              0.71165323 = fieldWeight in 4230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3676248 = idf(docFreq=541, maxDocs=42740)
                0.09375 = fieldNorm(doc=4230)
          0.14106362 = weight(abstract_txt:collection in 4230) [ClassicSimilarity], result of:
            0.14106362 = score(doc=4230,freq=2.0), product of:
              0.22823885 = queryWeight, product of:
                3.6426368 = boost
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.013441091 = queryNorm
              0.6180526 = fieldWeight in 4230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.09375 = fieldNorm(doc=4230)
          0.2220627 = weight(abstract_txt:test in 4230) [ClassicSimilarity], result of:
            0.2220627 = score(doc=4230,freq=3.0), product of:
              0.2698151 = queryWeight, product of:
                3.960538 = boost
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.013441091 = queryNorm
              0.82301813 = fieldWeight in 4230, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.09375 = fieldNorm(doc=4230)
        0.2 = coord(5/25)
    
  5. Witschel, H.F.: Global term weights in distributed environments (2008) 0.12
    0.12256245 = sum of:
      0.12256245 = product of:
        0.43772304 = sum of:
          0.028258786 = weight(abstract_txt:addition in 4097) [ClassicSimilarity], result of:
            0.028258786 = score(doc=4097,freq=1.0), product of:
              0.07003676 = queryWeight, product of:
                1.0089139 = boost
                5.1646085 = idf(docFreq=663, maxDocs=42740)
                0.013441091 = queryNorm
              0.40348503 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1646085 = idf(docFreq=663, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.039096117 = weight(abstract_txt:derived in 4097) [ClassicSimilarity], result of:
            0.039096117 = score(doc=4097,freq=1.0), product of:
              0.086958654 = queryWeight, product of:
                1.1242101 = boost
                5.7548075 = idf(docFreq=367, maxDocs=42740)
                0.013441091 = queryNorm
              0.44959432 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7548075 = idf(docFreq=367, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.03413 = weight(abstract_txt:retrieval in 4097) [ClassicSimilarity], result of:
            0.03413 = score(doc=4097,freq=4.0), product of:
              0.06304315 = queryWeight, product of:
                1.3537081 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.013441091 = queryNorm
              0.5413752 = fieldWeight in 4097, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.020539926 = weight(abstract_txt:some in 4097) [ClassicSimilarity], result of:
            0.020539926 = score(doc=4097,freq=1.0), product of:
              0.0713345 = queryWeight, product of:
                1.4399781 = boost
                3.6856086 = idf(docFreq=2913, maxDocs=42740)
                0.013441091 = queryNorm
              0.28793818 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6856086 = idf(docFreq=2913, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.0426132 = weight(abstract_txt:collections in 4097) [ClassicSimilarity], result of:
            0.0426132 = score(doc=4097,freq=1.0), product of:
              0.116036996 = queryWeight, product of:
                1.8365566 = boost
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.013441091 = queryNorm
              0.36723804 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.700647 = idf(docFreq=1055, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.16624507 = weight(abstract_txt:collection in 4097) [ClassicSimilarity], result of:
            0.16624507 = score(doc=4097,freq=4.0), product of:
              0.22823885 = queryWeight, product of:
                3.6426368 = boost
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.013441091 = queryNorm
              0.728382 = fieldWeight in 4097, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.661645 = idf(docFreq=1097, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
          0.10683997 = weight(abstract_txt:test in 4097) [ClassicSimilarity], result of:
            0.10683997 = score(doc=4097,freq=1.0), product of:
              0.2698151 = queryWeight, product of:
                3.960538 = boost
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.013441091 = queryNorm
              0.39597479 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.068477 = idf(docFreq=730, maxDocs=42740)
                0.078125 = fieldNorm(doc=4097)
        0.28 = coord(7/25)
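
The content-similarity scores combine several per-term scores and a coordination factor. As a rough sketch, assuming the same tf and idf definitions that fit the printed values (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the following recomputes the score of the first content result (doc 4457). The helper names are illustrative and not Lucene API calls; boosts, queryNorm, and fieldNorm are copied from the explain output.

```python
import math

MAX_DOCS = 42740
QUERY_NORM = 0.013441091   # copied from the explain output
FIELD_NORM = 0.0625        # fieldNorm(doc=4457)

def idf(doc_freq, max_docs=MAX_DOCS):
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost):
    # score = queryWeight * fieldWeight
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight

# (term, freq in doc 4457, docFreq, boost), all taken from the tree above
matched_terms = [
    ("compare",    1.0,  419, 1.0983902),
    ("retrieval",  1.0, 3633, 1.3537081),
    ("collection", 2.0, 1097, 3.6426368),
    ("test",       1.0,  730, 3.960538),
    ("reuters",    3.0,   60, 7.3764877),
]

term_sum = sum(term_score(f, df, b) for _, f, df, b in matched_terms)
coord = 5 / 25             # 5 of the 25 query terms matched
doc_score = term_sum * coord

print(round(term_sum, 4), round(doc_score, 4))
```

The coord factor explains why a document matching more query terms can outrank one with a higher raw term sum: result 2 has a smaller sum than result 1 but a coord of 7/25 instead of 5/25, which brings the two final scores close together.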