Document (#5818)

Author
Shaw, R.J.
Willett, P.
Title
On the non-random nature of nearest-neighbour document clusters
Source
Information processing and management. 29(1993) no.4, p.449-452
Year
1993
Abstract
It has been suggested that the observed values of retrieval effectiveness that are obtained in searches of files of nearest-neighbour clusters can be explained by assuming that the pairwise inter-document similarities used to construct the clusters have been generated randomly. Such similarities are significantly different from those obtained by a random generation procedure.

Similar documents (author)

  1. Shaw, R.R.: Classification systems (1962/63) 1.78
    1.7751309 = sum of:
      1.7751309 = product of:
        3.5502617 = sum of:
          3.5502617 = weight(author_txt:shaw in 603) [ClassicSimilarity], result of:
            3.5502617 = score(doc=603,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.08802168 = queryNorm
              5.0208282 = fieldWeight in 603, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.625 = fieldNorm(doc=603)
        0.5 = coord(1/2)
    
  2. Willett, P.: Recent trends in hierarchic document clustering : a critical review (1988) 1.78
    1.7751309 = sum of:
      1.7751309 = product of:
        3.5502617 = sum of:
          3.5502617 = weight(author_txt:willett in 2604) [ClassicSimilarity], result of:
            3.5502617 = score(doc=2604,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.08802168 = queryNorm
              5.0208282 = fieldWeight in 2604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.625 = fieldNorm(doc=2604)
        0.5 = coord(1/2)
    
  3. Shaw, W.M.: Subject and citation indexing : pt.1: the clustering structure of composite representations in the cystic fibrosis document collection (1991) 1.78
    1.7751309 = sum of:
      1.7751309 = product of:
        3.5502617 = sum of:
          3.5502617 = weight(author_txt:shaw in 4841) [ClassicSimilarity], result of:
            3.5502617 = score(doc=4841,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.08802168 = queryNorm
              5.0208282 = fieldWeight in 4841, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.625 = fieldNorm(doc=4841)
        0.5 = coord(1/2)
    
  4. Shaw, W.M.: Subject and citation indexing : pt.2: the optimal, cluster-based retrieval performance of composite representations (1991) 1.78
    1.7751309 = sum of:
      1.7751309 = product of:
        3.5502617 = sum of:
          3.5502617 = weight(author_txt:shaw in 4842) [ClassicSimilarity], result of:
            3.5502617 = score(doc=4842,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.08802168 = queryNorm
              5.0208282 = fieldWeight in 4842, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.625 = fieldNorm(doc=4842)
        0.5 = coord(1/2)
    
  5. Willett, P.: Best-match text retrieval (1993) 1.78
    1.7751309 = sum of:
      1.7751309 = product of:
        3.5502617 = sum of:
          3.5502617 = weight(author_txt:willett in 7818) [ClassicSimilarity], result of:
            3.5502617 = score(doc=7818,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.08802168 = queryNorm
              5.0208282 = fieldWeight in 7818, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.625 = fieldNorm(doc=7818)
        0.5 = coord(1/2)
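
The author-match scores above all follow the same arithmetic, which can be reproduced as a minimal sketch. It assumes the classic Lucene TF-IDF definitions, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq), which agree with the figures in the listing. The function names are illustrative and not part of any library, and queryNorm, fieldNorm and coord are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor, assumed form: square root of the raw frequency."""
    return math.sqrt(freq)


def single_term_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Score of a one-term match: queryWeight * fieldWeight * coord."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # 0.70710677 in the listing
    field_weight = classic_tf(freq) * idf * field_norm  # 5.0208282 in the listing
    return query_weight * field_weight * coord


# Values copied from the explain tree of the first author hit (doc=603).
score = single_term_score(
    freq=1.0,
    doc_freq=38,
    max_docs=44218,
    query_norm=0.08802168,
    field_norm=0.625,
    coord=0.5,
)
print(f"{score:.4f}")
```

The result should agree with the listed 1.7751309 up to single-precision rounding, since the listing was presumably computed with 32-bit floats. All five author hits share docFreq, fieldNorm and coord, which is why their scores are identical.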
    

Similar documents (content)

  1. Sembok, T.M.T.; Rijsbergen, C.J. van: IMAGING: a relevant feedback retrieval with nearest neighbour clusters (1994) 0.29
    0.28803223 = sum of:
      0.28803223 = product of:
        1.8002015 = sum of:
          0.17822435 = weight(abstract_txt:obtained in 1071) [ClassicSimilarity], result of:
            0.17822435 = score(doc=1071,freq=1.0), product of:
              0.19814004 = queryWeight, product of:
                2.2582538 = boost
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.015241395 = queryNorm
              0.89948684 = fieldWeight in 1071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.15625 = fieldNorm(doc=1071)
          0.50415295 = weight(abstract_txt:nearest in 1071) [ClassicSimilarity], result of:
            0.50415295 = score(doc=1071,freq=1.0), product of:
              0.39631063 = queryWeight, product of:
                3.1937761 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.015241395 = queryNorm
              1.2721156 = fieldWeight in 1071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.15625 = fieldNorm(doc=1071)
          0.7303008 = weight(abstract_txt:neighbour in 1071) [ClassicSimilarity], result of:
            0.7303008 = score(doc=1071,freq=1.0), product of:
              0.5073746 = queryWeight, product of:
                3.6136906 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015241395 = queryNorm
              1.4393721 = fieldWeight in 1071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.15625 = fieldNorm(doc=1071)
          0.3875234 = weight(abstract_txt:clusters in 1071) [ClassicSimilarity], result of:
            0.3875234 = score(doc=1071,freq=1.0), product of:
              0.38067695 = queryWeight, product of:
                3.833633 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.015241395 = queryNorm
              1.017985 = fieldWeight in 1071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.15625 = fieldNorm(doc=1071)
        0.16 = coord(4/25)
    
  2. Mohan, K.C.: Boolean and nearest neighbour text searching in a multi-strategy retrieval system (1996) 0.21
    0.20600453 = sum of:
      0.20600453 = product of:
        1.0300226 = sum of:
          0.043331817 = weight(abstract_txt:effectiveness in 7255) [ClassicSimilarity], result of:
            0.043331817 = score(doc=7255,freq=1.0), product of:
              0.0777064 = queryWeight, product of:
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.015241395 = queryNorm
              0.5576351 = fieldWeight in 7255, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.109375 = fieldNorm(doc=7255)
          0.091613635 = weight(abstract_txt:explained in 7255) [ClassicSimilarity], result of:
            0.091613635 = score(doc=7255,freq=1.0), product of:
              0.1280046 = queryWeight, product of:
                1.2834661 = boost
                6.543596 = idf(docFreq=172, maxDocs=44218)
                0.015241395 = queryNorm
              0.7157058 = fieldWeight in 7255, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.543596 = idf(docFreq=172, maxDocs=44218)
                0.109375 = fieldNorm(doc=7255)
          0.030959561 = weight(abstract_txt:been in 7255) [ClassicSimilarity], result of:
            0.030959561 = score(doc=7255,freq=1.0), product of:
              0.07824538 = queryWeight, product of:
                1.4191097 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.015241395 = queryNorm
              0.3956727 = fieldWeight in 7255, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.109375 = fieldNorm(doc=7255)
          0.35290703 = weight(abstract_txt:nearest in 7255) [ClassicSimilarity], result of:
            0.35290703 = score(doc=7255,freq=1.0), product of:
              0.39631063 = queryWeight, product of:
                3.1937761 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.015241395 = queryNorm
              0.8904809 = fieldWeight in 7255, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.109375 = fieldNorm(doc=7255)
          0.5112105 = weight(abstract_txt:neighbour in 7255) [ClassicSimilarity], result of:
            0.5112105 = score(doc=7255,freq=1.0), product of:
              0.5073746 = queryWeight, product of:
                3.6136906 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015241395 = queryNorm
              1.0075604 = fieldWeight in 7255, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.109375 = fieldNorm(doc=7255)
        0.2 = coord(5/25)
    
  3. Small, H.G.: Structural dynamics of scientific literature (2015) 0.20
    0.19862683 = sum of:
      0.19862683 = product of:
        0.7093815 = sum of:
          0.051937986 = weight(abstract_txt:observed in 2356) [ClassicSimilarity], result of:
            0.051937986 = score(doc=2356,freq=1.0), product of:
              0.109730564 = queryWeight, product of:
                1.1883255 = boost
                6.0585327 = idf(docFreq=280, maxDocs=44218)
                0.015241395 = queryNorm
              0.47332287 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0585327 = idf(docFreq=280, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.0666853 = weight(abstract_txt:procedure in 2356) [ClassicSimilarity], result of:
            0.0666853 = score(doc=2356,freq=1.0), product of:
              0.12962565 = queryWeight, product of:
                1.2915674 = boost
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.015241395 = queryNorm
              0.51444525 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.009320955 = weight(abstract_txt:that in 2356) [ClassicSimilarity], result of:
            0.009320955 = score(doc=2356,freq=1.0), product of:
              0.05035217 = queryWeight, product of:
                1.3942522 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015241395 = queryNorm
              0.18511525 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.022113971 = weight(abstract_txt:been in 2356) [ClassicSimilarity], result of:
            0.022113971 = score(doc=2356,freq=1.0), product of:
              0.07824538 = queryWeight, product of:
                1.4191097 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.015241395 = queryNorm
              0.28262335 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.03694677 = weight(abstract_txt:document in 2356) [ClassicSimilarity], result of:
            0.03694677 = score(doc=2356,freq=1.0), product of:
              0.11017047 = queryWeight, product of:
                1.6839113 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015241395 = queryNorm
              0.33536002 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.08911218 = weight(abstract_txt:obtained in 2356) [ClassicSimilarity], result of:
            0.08911218 = score(doc=2356,freq=1.0), product of:
              0.19814004 = queryWeight, product of:
                2.2582538 = boost
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.015241395 = queryNorm
              0.44974342 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
          0.43326437 = weight(abstract_txt:clusters in 2356) [ClassicSimilarity], result of:
            0.43326437 = score(doc=2356,freq=5.0), product of:
              0.38067695 = queryWeight, product of:
                3.833633 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.015241395 = queryNorm
              1.1381419 = fieldWeight in 2356, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.078125 = fieldNorm(doc=2356)
        0.28 = coord(7/25)
    
  4. Al-Hawamdeh, S.; Smith, G.; Willett, P.; Vere, R. de: Using nearest-neighbour searching techniques to access full-text documents (1991) 0.15
    0.14862278 = sum of:
      0.14862278 = product of:
        0.9288924 = sum of:
          0.013049337 = weight(abstract_txt:that in 2300) [ClassicSimilarity], result of:
            0.013049337 = score(doc=2300,freq=1.0), product of:
              0.05035217 = queryWeight, product of:
                1.3942522 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015241395 = queryNorm
              0.25916135 = fieldWeight in 2300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.109375 = fieldNorm(doc=2300)
          0.05172548 = weight(abstract_txt:document in 2300) [ClassicSimilarity], result of:
            0.05172548 = score(doc=2300,freq=1.0), product of:
              0.11017047 = queryWeight, product of:
                1.6839113 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015241395 = queryNorm
              0.46950403 = fieldWeight in 2300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.109375 = fieldNorm(doc=2300)
          0.35290703 = weight(abstract_txt:nearest in 2300) [ClassicSimilarity], result of:
            0.35290703 = score(doc=2300,freq=1.0), product of:
              0.39631063 = queryWeight, product of:
                3.1937761 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.015241395 = queryNorm
              0.8904809 = fieldWeight in 2300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.109375 = fieldNorm(doc=2300)
          0.5112105 = weight(abstract_txt:neighbour in 2300) [ClassicSimilarity], result of:
            0.5112105 = score(doc=2300,freq=1.0), product of:
              0.5073746 = queryWeight, product of:
                3.6136906 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015241395 = queryNorm
              1.0075604 = fieldWeight in 2300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.109375 = fieldNorm(doc=2300)
        0.16 = coord(4/25)
    
  5. Rasmussen, E.: Clustering algorithms (1992) 0.13
    0.12552086 = sum of:
      0.12552086 = product of:
        0.5230036 = sum of:
          0.024761038 = weight(abstract_txt:effectiveness in 3513) [ClassicSimilarity], result of:
            0.024761038 = score(doc=3513,freq=1.0), product of:
              0.0777064 = queryWeight, product of:
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.015241395 = queryNorm
              0.31864864 = fieldWeight in 3513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.010545456 = weight(abstract_txt:that in 3513) [ClassicSimilarity], result of:
            0.010545456 = score(doc=3513,freq=2.0), product of:
              0.05035217 = queryWeight, product of:
                1.3942522 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015241395 = queryNorm
              0.20943399 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.025019104 = weight(abstract_txt:been in 3513) [ClassicSimilarity], result of:
            0.025019104 = score(doc=3513,freq=2.0), product of:
              0.07824538 = queryWeight, product of:
                1.4191097 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.015241395 = queryNorm
              0.31975183 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.0418005 = weight(abstract_txt:document in 3513) [ClassicSimilarity], result of:
            0.0418005 = score(doc=3513,freq=2.0), product of:
              0.11017047 = queryWeight, product of:
                1.6839113 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015241395 = queryNorm
              0.37941656 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.20166117 = weight(abstract_txt:nearest in 3513) [ClassicSimilarity], result of:
            0.20166117 = score(doc=3513,freq=1.0), product of:
              0.39631063 = queryWeight, product of:
                3.1937761 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.015241395 = queryNorm
              0.5088462 = fieldWeight in 3513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.21921635 = weight(abstract_txt:clusters in 3513) [ClassicSimilarity], result of:
            0.21921635 = score(doc=3513,freq=2.0), product of:
              0.38067695 = queryWeight, product of:
                3.833633 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.015241395 = queryNorm
              0.57585925 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
        0.24 = coord(6/25)
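
The content-match scores combine several boosted terms. As a hedged sketch, the total for the first content hit (doc=1071) can be recomputed from the listed factors, assuming queryWeight = boost * idf * queryNorm, fieldWeight = sqrt(termFreq) * idf * fieldNorm, and a final multiplication by coord (matched terms / query terms). The helper names are illustrative only; the boost, idf, queryNorm and fieldNorm values are copied from the listing, not derived.

```python
import math

QUERY_NORM = 0.015241395  # queryNorm shared by all content-match terms
FIELD_NORM = 0.15625      # fieldNorm(doc=1071)

# (term, boost, idf, termFreq) copied from the explain tree for doc=1071.
TERMS = [
    ("obtained", 2.2582538, 5.756716, 1.0),
    ("nearest", 3.1937761, 8.14154, 1.0),
    ("neighbour", 3.6136906, 9.211981, 1.0),
    ("clusters", 3.833633, 6.515104, 1.0),
]


def term_score(boost, idf, freq, query_norm, field_norm):
    """Per-term contribution: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched, total, query_norm, field_norm):
    """Sum of per-term contributions, scaled by coord = matched / total."""
    coord = matched / total
    summed = sum(
        term_score(boost, idf, freq, query_norm, field_norm)
        for _, boost, idf, freq in terms
    )
    return coord * summed


doc_1071 = document_score(TERMS, matched=4, total=25,
                          query_norm=QUERY_NORM, field_norm=FIELD_NORM)
print(f"{doc_1071:.4f}")
```

The coord factor explains why a hit matching more of the 25 query terms can still rank lower: the third hit matches 7 terms (coord 0.28) but each contribution is damped by its smaller fieldNorm of 0.078125.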