Search (2 results, page 1 of 1)

Beaulieu, M.; Robertson, S.; Rasmussen, E.: Evaluating interactive systems in TREC (1996) 0.01

```
0.005486114 = product of:
  0.010972228 = sum of:
    0.010972228 = product of:
      0.021944456 = sum of:
        0.021944456 = weight(_text_:m in 2998) [ClassicSimilarity], result of:
          0.021944456 = score(doc=2998,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.19245613 = fieldWeight in 2998, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2998)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Rasmussen, E.: Clustering algorithms (1992) 0.00
```
0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 3513) [ClassicSimilarity], result of:
          0.015674612 = score(doc=3513,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 3513, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3513)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
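The score trees for documents 2998 and 3513 follow the same arithmetic, so it can be checked by hand. The sketch below is a minimal reconstruction, not the search engine's own code: the function name `classic_score` and its parameters are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed because they reproduce the tf and idf values printed in the trees.

```python
import math

def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    """Recompute one explain tree (hypothetical helper, assumed formulas)."""
    tf = math.sqrt(freq)                              # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # about 2.4884 for 9980 / 44218
    query_weight = idf * query_norm                   # "queryWeight" node
    field_weight = tf * idf * field_norm              # "fieldWeight" node
    score = query_weight * field_weight               # "weight(_text_:m ...)" node
    for c in coords:                                  # the two coord(1/2) factors
        score *= c
    return score

# Inputs copied from the two trees; only fieldNorm differs between the documents.
doc_2998 = classic_score(2.0, 9980, 44218, 0.045820985, 0.0546875)
doc_3513 = classic_score(2.0, 9980, 44218, 0.045820985, 0.0390625)
print(f"doc 2998: {doc_2998:.9f}")
print(f"doc 3513: {doc_3513:.9f}")
```

Both results should land close to the totals at the top of each tree. Because fieldNorm is the only input that differs, it alone accounts for the gap between the two scores.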
Abstract

Cluster analysis is a technique for multivariate analysis that assigns items to automatically created groups based on a calculation of the degree of association between items and groups. In the information retrieval field, cluster analysis has been used to create groups of documents with the goal of improving the efficiency and effectiveness of retrieval, or to determine the structure of the literature of a field. The terms in a document collection can also be clustered to show their relationships. The two main types of cluster analysis methods are the nonhierarchical, which divide a data set of N items into M clusters, and the hierarchical, which produce a nested data set in which pairs of items or clusters are successively linked. The nonhierarchical methods such as the single pass and reallocation methods are heuristic in nature and require less computation than the hierarchical methods. However, the hierarchical methods have usually been preferred for cluster-based document retrieval. The commonly used hierarchical methods, such as single link, complete link, group average link, and Ward's method, have high space and time requirements. In order to cluster the large data sets with high dimensionality that are typically found in IR applications, good algorithms (ideally O(N**2) time, O(N) space) must be found. Examples are the SLINK and minimal spanning tree algorithms for the single link method, the Voorhees algorithm for group average link, and the reciprocal nearest neighbor algorithm for Ward's method.
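To make the minimal spanning tree route to single link clustering concrete, here is a minimal sketch. It is not the chapter's own code: the names `mst_edges` and `single_link_clusters`, the `dist` callback, and the toy data are all assumptions made for illustration. Prim's algorithm builds the tree in O(N**2) time while keeping only O(N) working arrays, because distances are computed on demand instead of being stored in a full matrix. Cutting the M - 1 longest tree edges then leaves M single link clusters.

```python
import math

def mst_edges(points, dist):
    """Prim's algorithm: O(N^2) time, O(N) working space (no distance matrix)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n    # shortest known distance from the tree to each point
    parent = [-1] * n        # tree point that achieves that distance
    in_tree[0] = True
    for j in range(1, n):
        best[j] = dist(points[0], points[j])
        parent[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the outside point that is closest to the current tree.
        k = min((j for j in range(n) if not in_tree[j]), key=lambda j: best[j])
        edges.append((best[k], parent[k], k))
        in_tree[k] = True
        for j in range(n):
            if not in_tree[j]:
                d = dist(points[k], points[j])
                if d < best[j]:
                    best[j], parent[j] = d, k
    return edges

def single_link_clusters(points, dist, m):
    """Keep the N - m shortest MST edges; connected components are the clusters."""
    n = len(points)
    keep = sorted(mst_edges(points, dist))[: max(n - m, 0)]
    root = list(range(n))    # union-find forest over point indices

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]   # path halving
            i = root[i]
        return i

    for _, a, b in keep:
        root[find(a)] = find(b)
    return [find(i) for i in range(n)]

# Toy one-dimensional data (hypothetical); absolute difference as the distance.
labels = single_link_clusters([0.0, 1.0, 1.5, 10.0, 11.0], lambda a, b: abs(a - b), m=2)
print(labels)
```

The labels are arbitrary representative indices; only which points share a label matters. In this toy run the first three points should share one label and the last two another, since the longest tree edge is the gap between 1.5 and 10.0.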

