Document (#2332)

Author
Salton, G.
Title
Fast document classification in automatic information retrieval
Source
Kooperation in der Klassifikation I. Proc. der Sekt.1-3 der 2. Fachtagung der Gesellschaft für Klassifikation, Frankfurt-Hoechst, 6.-7.4.1978. Bearb.: W. Dahlberg
Imprint
Frankfurt : Gesellschaft für Klassifikation
Year
1978
Pages
pp. 129-146
Series
Studien zur Klassifikation; Bd.2
Abstract
A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all items within a cluster are jointly retrievable. Clustered files are easily adapted to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given, together with appropriate file search methods.
Theme
Automatisches Indexieren

Similar documents (author)

  1. Salton, G.: Another look at automatic text-retrieval systems (1986) 4.87
    4.8655405 = sum of:
      4.8655405 = weight(author_txt:salton in 1356) [ClassicSimilarity], result of:
        4.8655405 = fieldWeight in 1356, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7848644 = idf(docFreq=49, maxDocs=44218)
          0.625 = fieldNorm(doc=1356)
    
  2. Salton, G.: A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART) (1972) 4.87
    4.8655405 = sum of:
      4.8655405 = weight(author_txt:salton in 2325) [ClassicSimilarity], result of:
        4.8655405 = fieldWeight in 2325, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7848644 = idf(docFreq=49, maxDocs=44218)
          0.625 = fieldNorm(doc=2325)
    
  3. Salton, G.: Future prospects for text-based information retrieval (1990) 4.87
    4.8655405 = sum of:
      4.8655405 = weight(author_txt:salton in 2327) [ClassicSimilarity], result of:
        4.8655405 = fieldWeight in 2327, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7848644 = idf(docFreq=49, maxDocs=44218)
          0.625 = fieldNorm(doc=2327)
    
  4. Salton, G.: Expert systems and information retrieval (1987) 4.87
    4.8655405 = sum of:
      4.8655405 = weight(author_txt:salton in 2837) [ClassicSimilarity], result of:
        4.8655405 = fieldWeight in 2837, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7848644 = idf(docFreq=49, maxDocs=44218)
          0.625 = fieldNorm(doc=2837)
    
  5. Salton, G.: Historical note: the past thirty years in information retrieval (1987) 4.87
    4.8655405 = sum of:
      4.8655405 = weight(author_txt:salton in 3910) [ClassicSimilarity], result of:
        4.8655405 = fieldWeight in 3910, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.7848644 = idf(docFreq=49, maxDocs=44218)
          0.625 = fieldNorm(doc=3910)
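
The author-match scores above all follow the same pattern: fieldWeight = tf * idf * fieldNorm. The sketch below recomputes that product from the listed inputs. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are not stated in the listing but they reproduce its figures. The function names are illustrative only, and fieldNorm is copied from the listing rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    # Assumed term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed inverse document frequency; chosen because it matches
    # the idf values printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Inputs copied from the author_txt:salton entries (freq=1, docFreq=49).
idf_value = classic_idf(doc_freq=49, max_docs=44218)
score = field_weight(freq=1.0, doc_freq=49, max_docs=44218, field_norm=0.625)
print(f"idf={idf_value:.4f} score={score:.4f}")
```

Because the query has a single term and every author field here has the same length norm, all five entries receive the same score; only the document number differs.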
    

Similar documents (content)

  1. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2012) 0.17
    0.16736847 = sum of:
      0.16736847 = product of:
        0.6973686 = sum of:
          0.041670714 = weight(abstract_txt:appropriate in 310) [ClassicSimilarity], result of:
            0.041670714 = score(doc=310,freq=1.0), product of:
              0.100257345 = queryWeight, product of:
                1.0125256 = boost
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.018611675 = queryNorm
              0.41563752 = fieldWeight in 310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
          0.041710716 = weight(abstract_txt:simple in 310) [ClassicSimilarity], result of:
            0.041710716 = score(doc=310,freq=1.0), product of:
              0.100321494 = queryWeight, product of:
                1.0128495 = boost
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.018611675 = queryNorm
              0.41577047 = fieldWeight in 310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
          0.0535307 = weight(abstract_txt:fast in 310) [ClassicSimilarity], result of:
            0.0535307 = score(doc=310,freq=1.0), product of:
              0.11847612 = queryWeight, product of:
                1.1006857 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.018611675 = queryNorm
              0.4518269 = fieldWeight in 310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
          0.027091717 = weight(abstract_txt:search in 310) [ClassicSimilarity], result of:
            0.027091717 = score(doc=310,freq=1.0), product of:
              0.09479743 = queryWeight, product of:
                1.392391 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018611675 = queryNorm
              0.28578535 = fieldWeight in 310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
          0.10360888 = weight(abstract_txt:files in 310) [ClassicSimilarity], result of:
            0.10360888 = score(doc=310,freq=1.0), product of:
              0.2318303 = queryWeight, product of:
                2.177449 = boost
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.018611675 = queryNorm
              0.44691688 = fieldWeight in 310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
          0.4297559 = weight(abstract_txt:file in 310) [ClassicSimilarity], result of:
            0.4297559 = score(doc=310,freq=5.0), product of:
              0.44096768 = queryWeight, product of:
                4.24699 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018611675 = queryNorm
              0.97457457 = fieldWeight in 310, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.078125 = fieldNorm(doc=310)
        0.24 = coord(6/25)
    
  2. Lee, D.L.; Ren, L.: Document ranking on weight-partitioned signature files (1996) 0.16
    0.15965942 = sum of:
      0.15965942 = product of:
        0.79829705 = sum of:
          0.04817195 = weight(abstract_txt:together in 2417) [ClassicSimilarity], result of:
            0.04817195 = score(doc=2417,freq=1.0), product of:
              0.09779219 = queryWeight, product of:
                5.254347 = idf(docFreq=627, maxDocs=44218)
                0.018611675 = queryNorm
              0.49259502 = fieldWeight in 2417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.254347 = idf(docFreq=627, maxDocs=44218)
                0.09375 = fieldNorm(doc=2417)
          0.03251006 = weight(abstract_txt:search in 2417) [ClassicSimilarity], result of:
            0.03251006 = score(doc=2417,freq=1.0), product of:
              0.09479743 = queryWeight, product of:
                1.392391 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018611675 = queryNorm
              0.34294242 = fieldWeight in 2417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.09375 = fieldNorm(doc=2417)
          0.13202196 = weight(abstract_txt:grouped in 2417) [ClassicSimilarity], result of:
            0.13202196 = score(doc=2417,freq=1.0), product of:
              0.19151619 = queryWeight, product of:
                1.3994282 = boost
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.018611675 = queryNorm
              0.68935144 = fieldWeight in 2417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.09375 = fieldNorm(doc=2417)
          0.12433066 = weight(abstract_txt:files in 2417) [ClassicSimilarity], result of:
            0.12433066 = score(doc=2417,freq=1.0), product of:
              0.2318303 = queryWeight, product of:
                2.177449 = boost
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.018611675 = queryNorm
              0.5363003 = fieldWeight in 2417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.09375 = fieldNorm(doc=2417)
          0.4612624 = weight(abstract_txt:file in 2417) [ClassicSimilarity], result of:
            0.4612624 = score(doc=2417,freq=4.0), product of:
              0.44096768 = queryWeight, product of:
                4.24699 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018611675 = queryNorm
              1.0460231 = fieldWeight in 2417, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.09375 = fieldNorm(doc=2417)
        0.2 = coord(5/25)
    
  3. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2014) 0.13
    0.13405538 = sum of:
      0.13405538 = product of:
        0.6702769 = sum of:
          0.041670714 = weight(abstract_txt:appropriate in 1970) [ClassicSimilarity], result of:
            0.041670714 = score(doc=1970,freq=1.0), product of:
              0.100257345 = queryWeight, product of:
                1.0125256 = boost
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.018611675 = queryNorm
              0.41563752 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.078125 = fieldNorm(doc=1970)
          0.041710716 = weight(abstract_txt:simple in 1970) [ClassicSimilarity], result of:
            0.041710716 = score(doc=1970,freq=1.0), product of:
              0.100321494 = queryWeight, product of:
                1.0128495 = boost
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.018611675 = queryNorm
              0.41577047 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.078125 = fieldNorm(doc=1970)
          0.0535307 = weight(abstract_txt:fast in 1970) [ClassicSimilarity], result of:
            0.0535307 = score(doc=1970,freq=1.0), product of:
              0.11847612 = queryWeight, product of:
                1.1006857 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.018611675 = queryNorm
              0.4518269 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.078125 = fieldNorm(doc=1970)
          0.10360888 = weight(abstract_txt:files in 1970) [ClassicSimilarity], result of:
            0.10360888 = score(doc=1970,freq=1.0), product of:
              0.2318303 = queryWeight, product of:
                2.177449 = boost
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.018611675 = queryNorm
              0.44691688 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.720536 = idf(docFreq=393, maxDocs=44218)
                0.078125 = fieldNorm(doc=1970)
          0.4297559 = weight(abstract_txt:file in 1970) [ClassicSimilarity], result of:
            0.4297559 = score(doc=1970,freq=5.0), product of:
              0.44096768 = queryWeight, product of:
                4.24699 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018611675 = queryNorm
              0.97457457 = fieldWeight in 1970, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.078125 = fieldNorm(doc=1970)
        0.2 = coord(5/25)
    
  4. Zamir, O.; Etzioni, O.: Grouper : a dynamic clustering interface to Web search results (1999) 0.13
    0.12539591 = sum of:
      0.12539591 = product of:
        0.522483 = sum of:
          0.041710716 = weight(abstract_txt:simple in 6207) [ClassicSimilarity], result of:
            0.041710716 = score(doc=6207,freq=1.0), product of:
              0.100321494 = queryWeight, product of:
                1.0128495 = boost
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.018611675 = queryNorm
              0.41577047 = fieldWeight in 6207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.0535307 = weight(abstract_txt:fast in 6207) [ClassicSimilarity], result of:
            0.0535307 = score(doc=6207,freq=1.0), product of:
              0.11847612 = queryWeight, product of:
                1.1006857 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.018611675 = queryNorm
              0.4518269 = fieldWeight in 6207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.13294494 = weight(abstract_txt:clustering in 6207) [ClassicSimilarity], result of:
            0.13294494 = score(doc=6207,freq=4.0), product of:
              0.13687478 = queryWeight, product of:
                1.1830678 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018611675 = queryNorm
              0.9712888 = fieldWeight in 6207, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.13255052 = weight(abstract_txt:clusters in 6207) [ClassicSimilarity], result of:
            0.13255052 = score(doc=6207,freq=3.0), product of:
              0.15035208 = queryWeight, product of:
                1.2399455 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.018611675 = queryNorm
              0.88160086 = fieldWeight in 6207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.13465436 = weight(abstract_txt:cluster in 6207) [ClassicSimilarity], result of:
            0.13465436 = score(doc=6207,freq=3.0), product of:
              0.15193883 = queryWeight, product of:
                1.2464713 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018611675 = queryNorm
              0.88624066 = fieldWeight in 6207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.027091717 = weight(abstract_txt:search in 6207) [ClassicSimilarity], result of:
            0.027091717 = score(doc=6207,freq=1.0), product of:
              0.09479743 = queryWeight, product of:
                1.392391 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018611675 = queryNorm
              0.28578535 = fieldWeight in 6207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
        0.24 = coord(6/25)
    
  5. Rasmussen, E.: Clustering algorithms (1992) 0.12
    0.118756175 = sum of:
      0.118756175 = product of:
        0.5937809 = sum of:
          0.076877065 = weight(abstract_txt:items in 3513) [ClassicSimilarity], result of:
            0.076877065 = score(doc=3513,freq=4.0), product of:
              0.11024192 = queryWeight, product of:
                1.0617476 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018611675 = queryNorm
              0.6973488 = fieldWeight in 3513, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.08658163 = weight(abstract_txt:clusters in 3513) [ClassicSimilarity], result of:
            0.08658163 = score(doc=3513,freq=2.0), product of:
              0.15035208 = queryWeight, product of:
                1.2399455 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.018611675 = queryNorm
              0.57585925 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.13907044 = weight(abstract_txt:cluster in 3513) [ClassicSimilarity], result of:
            0.13907044 = score(doc=3513,freq=5.0), product of:
              0.15193883 = queryWeight, product of:
                1.2464713 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018611675 = queryNorm
              0.9153055 = fieldWeight in 3513, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.07733508 = weight(abstract_txt:methods in 3513) [ClassicSimilarity], result of:
            0.07733508 = score(doc=3513,freq=6.0), product of:
              0.12181838 = queryWeight, product of:
                1.5784081 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.018611675 = queryNorm
              0.6348392 = fieldWeight in 3513, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.2139166 = weight(abstract_txt:clustered in 3513) [ClassicSimilarity], result of:
            0.2139166 = score(doc=3513,freq=1.0), product of:
              0.4361895 = queryWeight, product of:
                2.9867613 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.018611675 = queryNorm
              0.49042124 = fieldWeight in 3513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
        0.2 = coord(5/25)
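
The content-match scores combine several terms: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The sketch below recomputes entry 1 (doc=310) under the same assumed tf and idf formulas as the listing's figures suggest. The boost, queryNorm, and fieldNorm values are copied from the listing, not derived, and the helper names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.018611675   # copied from the listing
FIELD_NORM = 0.078125      # fieldNorm(doc=310), copied from the listing
QUERY_TERM_COUNT = 25      # denominator of coord(6/25)


def idf(doc_freq: int) -> float:
    # Assumed formula; it matches the idf values printed in the listing.
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1.0))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


# (term, freq, docFreq, boost) for the six terms matched in doc=310.
matched_terms = [
    ("appropriate", 1.0, 587, 1.0125256),
    ("simple", 1.0, 586, 1.0128495),
    ("fast", 1.0, 369, 1.1006857),
    ("search", 1.0, 3098, 1.392391),
    ("files", 1.0, 393, 2.177449),
    ("file", 5.0, 453, 4.24699),
]

raw_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in matched_terms)
coord = len(matched_terms) / QUERY_TERM_COUNT
score = coord * raw_sum
print(f"sum={raw_sum:.4f} coord={coord:.2f} score={score:.4f}")
```

The coord factor explains why entry 1 (6 of 25 terms matched) outranks entry 2 even though entry 2 has the larger raw sum: 0.24 * 0.697 exceeds 0.2 * 0.798.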