Document (#2332)

Author
Salton, G.
Title
Fast document classification in automatic information retrieval
Source
Kooperation in der Klassifikation I. Proc. der Sekt.1-3 der 2. Fachtagung der Gesellschaft für Klassifikation, Frankfurt-Hoechst, 6.-7.4.1978. Bearb.: W. Dahlberg
Imprint
Frankfurt : Gesellschaft für Klassifikation
Year
1978
Pages
pp. 129-146
Series
Studien zur Klassifikation; Bd.2
Abstract
A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all items within a cluster are jointly retrievable. Clustered files are easily adapted to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given, together with appropriate file search methods.
Theme
Automatic indexing

Similar documents (author)

  1. Salton, G.: Another look at automatic text-retrieval systems (1986) 4.84
    4.844292 = sum of:
      4.844292 = weight(author_txt:salton in 1356) [ClassicSimilarity], result of:
        4.844292 = fieldWeight in 1356, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7508674 = idf(docFreq=49, maxDocs=42740)
          0.625 = fieldNorm(doc=1356)
    
  2. Salton, G.: A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART) (1972) 4.84
    4.844292 = sum of:
      4.844292 = weight(author_txt:salton in 2325) [ClassicSimilarity], result of:
        4.844292 = fieldWeight in 2325, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7508674 = idf(docFreq=49, maxDocs=42740)
          0.625 = fieldNorm(doc=2325)
    
  3. Salton, G.: Future prospects for text-based information retrieval (1990) 4.84
    4.844292 = sum of:
      4.844292 = weight(author_txt:salton in 2327) [ClassicSimilarity], result of:
        4.844292 = fieldWeight in 2327, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7508674 = idf(docFreq=49, maxDocs=42740)
          0.625 = fieldNorm(doc=2327)
    
  4. Salton, G.: Expert systems and information retrieval (1987) 4.84
    4.844292 = sum of:
      4.844292 = weight(author_txt:salton in 2837) [ClassicSimilarity], result of:
        4.844292 = fieldWeight in 2837, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7508674 = idf(docFreq=49, maxDocs=42740)
          0.625 = fieldNorm(doc=2837)
    
  5. Salton, G.: Historical note: the past thirty years in information retrieval (1987) 4.84
    4.844292 = sum of:
      4.844292 = weight(author_txt:salton in 3910) [ClassicSimilarity], result of:
        4.844292 = fieldWeight in 3910, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7508674 = idf(docFreq=49, maxDocs=42740)
          0.625 = fieldNorm(doc=3910)
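
The author-match scores above all follow the same arithmetic. The sketch below reproduces it, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); this idf variant is an assumption, chosen because it reproduces the listed value 7.7508674. The function names are illustrative and are not Lucene API calls.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed variant, see lead-in)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: author_txt:salton, docFreq=49, maxDocs=42740
score = field_weight(freq=1.0, doc_freq=49, max_docs=42740, field_norm=0.625)
print(round(score, 2))  # prints 4.84
```

Because every listed record matches the author term once and has the same fieldNorm, all five receive the identical score of 4.84.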
    

Similar documents (content)

  1. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2012) 0.17
    0.16704923 = sum of:
      0.16704923 = product of:
        0.6960385 = sum of:
          0.041888956 = weight(abstract_txt:appropriate in 2311) [ClassicSimilarity], result of:
            0.041888956 = score(doc=2311,freq=1.0), product of:
              0.100603715 = queryWeight, product of:
                1.0117666 = boost
                5.329611 = idf(docFreq=562, maxDocs=42740)
                0.018656844 = queryNorm
              0.41637585 = fieldWeight in 2311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.329611 = idf(docFreq=562, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
          0.042057298 = weight(abstract_txt:simple in 2311) [ClassicSimilarity], result of:
            0.042057298 = score(doc=2311,freq=1.0), product of:
              0.10087307 = queryWeight, product of:
                1.01312 = boost
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.018656844 = queryNorm
              0.41693288 = fieldWeight in 2311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
          0.054611653 = weight(abstract_txt:fast in 2311) [ClassicSimilarity], result of:
            0.054611653 = score(doc=2311,freq=1.0), product of:
              0.12006171 = queryWeight, product of:
                1.1052883 = boost
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.018656844 = queryNorm
              0.4548632 = fieldWeight in 2311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
          0.026944466 = weight(abstract_txt:search in 2311) [ClassicSimilarity], result of:
            0.026944466 = score(doc=2311,freq=1.0), product of:
              0.09445045 = queryWeight, product of:
                1.3864057 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.018656844 = queryNorm
              0.2852762 = fieldWeight in 2311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
          0.10300797 = weight(abstract_txt:files in 2311) [ClassicSimilarity], result of:
            0.10300797 = score(doc=2311,freq=1.0), product of:
              0.2309253 = queryWeight, product of:
                2.1678243 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.018656844 = queryNorm
              0.4460662 = fieldWeight in 2311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
          0.42752817 = weight(abstract_txt:file in 2311) [ClassicSimilarity], result of:
            0.42752817 = score(doc=2311,freq=5.0), product of:
              0.43942773 = queryWeight, product of:
                4.229091 = boost
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.018656844 = queryNorm
              0.9729203 = fieldWeight in 2311, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.078125 = fieldNorm(doc=2311)
        0.24 = coord(6/25)
    
  2. Lee, D.L.; Ren, L.: Document ranking on weight-partitioned signature files (1996) 0.16
    0.15942986 = sum of:
      0.15942986 = product of:
        0.7971493 = sum of:
          0.0485333 = weight(abstract_txt:together in 3418) [ClassicSimilarity], result of:
            0.0485333 = score(doc=3418,freq=1.0), product of:
              0.09827734 = queryWeight, product of:
                5.267629 = idf(docFreq=598, maxDocs=42740)
                0.018656844 = queryNorm
              0.49384022 = fieldWeight in 3418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.267629 = idf(docFreq=598, maxDocs=42740)
                0.09375 = fieldNorm(doc=3418)
          0.032333363 = weight(abstract_txt:search in 3418) [ClassicSimilarity], result of:
            0.032333363 = score(doc=3418,freq=1.0), product of:
              0.09445045 = queryWeight, product of:
                1.3864057 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.018656844 = queryNorm
              0.34233147 = fieldWeight in 3418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.09375 = fieldNorm(doc=3418)
          0.13380173 = weight(abstract_txt:grouped in 3418) [ClassicSimilarity], result of:
            0.13380173 = score(doc=3418,freq=1.0), product of:
              0.19322707 = queryWeight, product of:
                1.4021914 = boost
                7.3862243 = idf(docFreq=71, maxDocs=42740)
                0.018656844 = queryNorm
              0.6924585 = fieldWeight in 3418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3862243 = idf(docFreq=71, maxDocs=42740)
                0.09375 = fieldNorm(doc=3418)
          0.12360956 = weight(abstract_txt:files in 3418) [ClassicSimilarity], result of:
            0.12360956 = score(doc=3418,freq=1.0), product of:
              0.2309253 = queryWeight, product of:
                2.1678243 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.018656844 = queryNorm
              0.5352794 = fieldWeight in 3418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.09375 = fieldNorm(doc=3418)
          0.45887136 = weight(abstract_txt:file in 3418) [ClassicSimilarity], result of:
            0.45887136 = score(doc=3418,freq=4.0), product of:
              0.43942773 = queryWeight, product of:
                4.229091 = boost
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.018656844 = queryNorm
              1.0442476 = fieldWeight in 3418, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.09375 = fieldNorm(doc=3418)
        0.2 = coord(5/25)
    
  3. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2014) 0.13
    0.13381882 = sum of:
      0.13381882 = product of:
        0.6690941 = sum of:
          0.041888956 = weight(abstract_txt:appropriate in 3971) [ClassicSimilarity], result of:
            0.041888956 = score(doc=3971,freq=1.0), product of:
              0.100603715 = queryWeight, product of:
                1.0117666 = boost
                5.329611 = idf(docFreq=562, maxDocs=42740)
                0.018656844 = queryNorm
              0.41637585 = fieldWeight in 3971, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.329611 = idf(docFreq=562, maxDocs=42740)
                0.078125 = fieldNorm(doc=3971)
          0.042057298 = weight(abstract_txt:simple in 3971) [ClassicSimilarity], result of:
            0.042057298 = score(doc=3971,freq=1.0), product of:
              0.10087307 = queryWeight, product of:
                1.01312 = boost
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.018656844 = queryNorm
              0.41693288 = fieldWeight in 3971, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.078125 = fieldNorm(doc=3971)
          0.054611653 = weight(abstract_txt:fast in 3971) [ClassicSimilarity], result of:
            0.054611653 = score(doc=3971,freq=1.0), product of:
              0.12006171 = queryWeight, product of:
                1.1052883 = boost
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.018656844 = queryNorm
              0.4548632 = fieldWeight in 3971, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.078125 = fieldNorm(doc=3971)
          0.10300797 = weight(abstract_txt:files in 3971) [ClassicSimilarity], result of:
            0.10300797 = score(doc=3971,freq=1.0), product of:
              0.2309253 = queryWeight, product of:
                2.1678243 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.018656844 = queryNorm
              0.4460662 = fieldWeight in 3971, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.078125 = fieldNorm(doc=3971)
          0.42752817 = weight(abstract_txt:file in 3971) [ClassicSimilarity], result of:
            0.42752817 = score(doc=3971,freq=5.0), product of:
              0.43942773 = queryWeight, product of:
                4.229091 = boost
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.018656844 = queryNorm
              0.9729203 = fieldWeight in 3971, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5693207 = idf(docFreq=442, maxDocs=42740)
                0.078125 = fieldNorm(doc=3971)
        0.2 = coord(5/25)
    
  4. Zamir, O.; Etzioni, O.: Grouper : a dynamic clustering interface to Web search results (1999) 0.13
    0.1261367 = sum of:
      0.1261367 = product of:
        0.5255696 = sum of:
          0.042057298 = weight(abstract_txt:simple in 208) [ClassicSimilarity], result of:
            0.042057298 = score(doc=208,freq=1.0), product of:
              0.10087307 = queryWeight, product of:
                1.01312 = boost
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.018656844 = queryNorm
              0.41693288 = fieldWeight in 208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.336741 = idf(docFreq=558, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
          0.054611653 = weight(abstract_txt:fast in 208) [ClassicSimilarity], result of:
            0.054611653 = score(doc=208,freq=1.0), product of:
              0.12006171 = queryWeight, product of:
                1.1052883 = boost
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.018656844 = queryNorm
              0.4548632 = fieldWeight in 208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.822249 = idf(docFreq=343, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
          0.13320276 = weight(abstract_txt:clustering in 208) [ClassicSimilarity], result of:
            0.13320276 = score(doc=208,freq=4.0), product of:
              0.13704708 = queryWeight, product of:
                1.1808866 = boost
                6.220473 = idf(docFreq=230, maxDocs=42740)
                0.018656844 = queryNorm
              0.97194886 = fieldWeight in 208, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.220473 = idf(docFreq=230, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
          0.13327016 = weight(abstract_txt:clusters in 208) [ClassicSimilarity], result of:
            0.13327016 = score(doc=208,freq=3.0), product of:
              0.15089071 = queryWeight, product of:
                1.2390949 = boost
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.018656844 = queryNorm
              0.88322306 = fieldWeight in 208, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
          0.13548325 = weight(abstract_txt:cluster in 208) [ClassicSimilarity], result of:
            0.13548325 = score(doc=208,freq=3.0), product of:
              0.15255658 = queryWeight, product of:
                1.2459161 = boost
                6.563024 = idf(docFreq=163, maxDocs=42740)
                0.018656844 = queryNorm
              0.88808525 = fieldWeight in 208, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.563024 = idf(docFreq=163, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
          0.026944466 = weight(abstract_txt:search in 208) [ClassicSimilarity], result of:
            0.026944466 = score(doc=208,freq=1.0), product of:
              0.09445045 = queryWeight, product of:
                1.3864057 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.018656844 = queryNorm
              0.2852762 = fieldWeight in 208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.078125 = fieldNorm(doc=208)
        0.24 = coord(6/25)
    
  5. Rasmussen, E.: Clustering algorithms (1992) 0.12
    0.11915177 = sum of:
      0.11915177 = product of:
        0.59575886 = sum of:
          0.0771366 = weight(abstract_txt:items in 4514) [ClassicSimilarity], result of:
            0.0771366 = score(doc=4514,freq=4.0), product of:
              0.11048619 = queryWeight, product of:
                1.0602964 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.018656844 = queryNorm
              0.69815606 = fieldWeight in 4514, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.08705169 = weight(abstract_txt:clusters in 4514) [ClassicSimilarity], result of:
            0.08705169 = score(doc=4514,freq=2.0), product of:
              0.15089071 = queryWeight, product of:
                1.2390949 = boost
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.018656844 = queryNorm
              0.57691884 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.13992651 = weight(abstract_txt:cluster in 4514) [ClassicSimilarity], result of:
            0.13992651 = score(doc=4514,freq=5.0), product of:
              0.15255658 = queryWeight, product of:
                1.2459161 = boost
                6.563024 = idf(docFreq=163, maxDocs=42740)
                0.018656844 = queryNorm
              0.9172105 = fieldWeight in 4514, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.563024 = idf(docFreq=163, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.078768805 = weight(abstract_txt:methods in 4514) [ClassicSimilarity], result of:
            0.078768805 = score(doc=4514,freq=6.0), product of:
              0.12331523 = queryWeight, product of:
                1.5841514 = boost
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.018656844 = queryNorm
              0.63875973 = fieldWeight in 4514, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.21287523 = weight(abstract_txt:clustered in 4514) [ClassicSimilarity], result of:
            0.21287523 = score(doc=4514,freq=1.0), product of:
              0.43475816 = queryWeight, product of:
                2.9744878 = boost
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.018656844 = queryNorm
              0.48964056 = fieldWeight in 4514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
        0.2 = coord(5/25)
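
The content-similarity scores combine a query-side and a document-side weight per term, sum them, and scale by the coordination factor. The sketch below recomputes the first entry (doc 2311) from the values listed above, under the same assumed classic TF-IDF definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names are hypothetical; the boosts, document frequencies, queryNorm, and fieldNorm are copied from the listing.

```python
import math

QUERY_NORM = 0.018656844  # queryNorm from the listing
MAX_DOCS = 42740          # maxDocs from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed classic idf variant that matches the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, matched: int, total: int) -> float:
    """Sum the per-term scores, then multiply by coord(matched/total)."""
    raw = sum(term_score(freq, df, boost, field_norm) for _, freq, df, boost in terms)
    return raw * (matched / total)


# (term, termFreq, docFreq, boost) for doc 2311, fieldNorm = 0.078125
terms_2311 = [
    ("appropriate", 1.0, 562, 1.0117666),
    ("simple", 1.0, 558, 1.01312),
    ("fast", 1.0, 343, 1.1052883),
    ("search", 1.0, 3014, 1.3864057),
    ("files", 1.0, 384, 2.1678243),
    ("file", 5.0, 442, 4.229091),
]

score = document_score(terms_2311, field_norm=0.078125, matched=6, total=25)
print(round(score, 2))  # prints 0.17
```

The term "file" dominates this score because it carries the highest boost and occurs five times in the matched abstract; the coord factor of 6/25 then scales the sum down, since only 6 of the 25 query terms matched.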