Document (#28647)

Author
Needham, R.M.
Sparck Jones, K.
Title
Keywords and clumps
Source
Theory of subject analysis: a sourcebook. Ed.: L.M. Chan, et al
Imprint
Littleton, CO : Libraries Unlimited
Year
1985
Pages
pp. 262-272
Abstract
The selection that follows was chosen as it represents "a very early paper on the possibilities allowed by computers on documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control, thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms which, when substituted one for another, could be used to improve retrieval performance. One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England. The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic natural language processing. Based on the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification.
The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows. Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true economic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to library/information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: 1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; 2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, 3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.
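The two steps named in the abstract (compute pairwise term similarity from shared documents, then group the most similar terms) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `index`, the Jaccard (Tanimoto) normalisation of the shared-document count, the threshold of 0.5, and the single-link grouping. It is not the clump criterion of Needham and Sparck Jones, which the abstract does not specify.

```python
from itertools import combinations

# Hypothetical toy index: keyword -> set of ids of the documents it indexes.
index = {
    "retrieval": {1, 2, 3},
    "search": {2, 3, 4},
    "thesaurus": {5, 6},
    "vocabulary": {5, 6, 7},
    "taxonomy": {8},
}

def similarity(a, b):
    """Step 1: documents indexed in common, normalised by the union (Jaccard)."""
    docs_a, docs_b = index[a], index[b]
    return len(docs_a & docs_b) / len(docs_a | docs_b)

def group_terms(threshold):
    """Step 2: link terms whose similarity meets the threshold and return the
    connected components (single-link grouping, an assumed stand-in for clumping)."""
    parent = {term: term for term in index}

    def find(term):
        # Follow parent links up to the representative of the group.
        while parent[term] != term:
            term = parent[term]
        return term

    for a, b in combinations(sorted(index), 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    classes = {}
    for term in index:
        classes.setdefault(find(term), set()).add(term)
    return sorted(sorted(members) for members in classes.values())

term_classes = group_terms(threshold=0.5)
print(term_classes)
```

In this toy data "retrieval" and "search" share two of four documents and fall into one class, which a retrieval system could then treat as intersubstitutable at query time. Different grouping rules applied to the same similarity values would yield different classes, which is the point the abstract makes about what distinguishes the various approaches.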
Footnote
Reprint of the original article with commentary by the editors
Original in: Journal of documentation 20(1964) no.1, pp. 5-15.
Theme
Computational linguistics
Automatic indexing

Similar documents (author)

  1. Sparck Jones, K.: Fashionable trends and feasible strategies in information management (1988) 5.31
    5.3121734 = sum of:
      5.3121734 = sum of:
        2.0592055 = weight(author_txt:jones in 817) [ClassicSimilarity], result of:
          2.0592055 = score(doc=817,freq=1.0), product of:
            0.5934105 = queryWeight, product of:
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.085502885 = queryNorm
            3.4701197 = fieldWeight in 817, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.5 = fieldNorm(doc=817)
        3.2529678 = weight(author_txt:sparck in 817) [ClassicSimilarity], result of:
          3.2529678 = score(doc=817,freq=1.0), product of:
            0.80490005 = queryWeight, product of:
              1.1646445 = boost
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.085502885 = queryNorm
            4.0414557 = fieldWeight in 817, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.5 = fieldNorm(doc=817)
    
  2. Sparck Jones, K.: Automatic classification (1976) 5.31
    5.3121734 = sum of:
      5.3121734 = sum of:
        2.0592055 = weight(author_txt:jones in 2908) [ClassicSimilarity], result of:
          2.0592055 = score(doc=2908,freq=1.0), product of:
            0.5934105 = queryWeight, product of:
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.085502885 = queryNorm
            3.4701197 = fieldWeight in 2908, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.5 = fieldNorm(doc=2908)
        3.2529678 = weight(author_txt:sparck in 2908) [ClassicSimilarity], result of:
          3.2529678 = score(doc=2908,freq=1.0), product of:
            0.80490005 = queryWeight, product of:
              1.1646445 = boost
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.085502885 = queryNorm
            4.0414557 = fieldWeight in 2908, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.5 = fieldNorm(doc=2908)
    
  3. Sparck Jones, K.: ¬The role of artificial intelligence in information retrieval (1991) 5.31
    5.3121734 = sum of:
      5.3121734 = sum of:
        2.0592055 = weight(author_txt:jones in 4811) [ClassicSimilarity], result of:
          2.0592055 = score(doc=4811,freq=1.0), product of:
            0.5934105 = queryWeight, product of:
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.085502885 = queryNorm
            3.4701197 = fieldWeight in 4811, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.5 = fieldNorm(doc=4811)
        3.2529678 = weight(author_txt:sparck in 4811) [ClassicSimilarity], result of:
          3.2529678 = score(doc=4811,freq=1.0), product of:
            0.80490005 = queryWeight, product of:
              1.1646445 = boost
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.085502885 = queryNorm
            4.0414557 = fieldWeight in 4811, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.5 = fieldNorm(doc=4811)
    
  4. Sparck Jones, K.: Automatic keyword classification for information retrieval (1971) 5.31
    5.3121734 = sum of:
      5.3121734 = sum of:
        2.0592055 = weight(author_txt:jones in 5176) [ClassicSimilarity], result of:
          2.0592055 = score(doc=5176,freq=1.0), product of:
            0.5934105 = queryWeight, product of:
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.085502885 = queryNorm
            3.4701197 = fieldWeight in 5176, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.5 = fieldNorm(doc=5176)
        3.2529678 = weight(author_txt:sparck in 5176) [ClassicSimilarity], result of:
          3.2529678 = score(doc=5176,freq=1.0), product of:
            0.80490005 = queryWeight, product of:
              1.1646445 = boost
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.085502885 = queryNorm
            4.0414557 = fieldWeight in 5176, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.5 = fieldNorm(doc=5176)
    
  5. Sparck Jones, K.: ¬A statistical interpretation of term specificity and its application in retrieval (1972) 5.31
    5.3121734 = sum of:
      5.3121734 = sum of:
        2.0592055 = weight(author_txt:jones in 5187) [ClassicSimilarity], result of:
          2.0592055 = score(doc=5187,freq=1.0), product of:
            0.5934105 = queryWeight, product of:
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.085502885 = queryNorm
            3.4701197 = fieldWeight in 5187, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.9402394 = idf(docFreq=115, maxDocs=44083)
              0.5 = fieldNorm(doc=5187)
        3.2529678 = weight(author_txt:sparck in 5187) [ClassicSimilarity], result of:
          3.2529678 = score(doc=5187,freq=1.0), product of:
            0.80490005 = queryWeight, product of:
              1.1646445 = boost
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.085502885 = queryNorm
            4.0414557 = fieldWeight in 5187, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.0829115 = idf(docFreq=36, maxDocs=44083)
              0.5 = fieldNorm(doc=5187)
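
The score trees above can be checked by hand. The sketch below recomputes the first entry (doc=817) from the values printed in the listing. The function names are illustrative, and the formulas are inferred from the printed figures rather than taken from the scoring library's source: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. Other library versions may define idf slightly differently.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, as implied by the listing's numbers.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, following the tree structure above.
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc=817.
jones = term_score(1.0, 115, 44083, 0.085502885, 0.5)
sparck = term_score(1.0, 36, 44083, 0.085502885, 0.5, boost=1.1646445)
total = jones + sparck
print(round(jones, 4), round(sparck, 4), round(total, 4))
```

Because all five author matches have the same term frequencies and field norm, they receive the same total, which is why every entry in this list shows 5.31.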
    

Similar documents (content)

  1. Borko, H.: Research in computer based classification systems (1985) 0.78
    0.7834767 = sum of:
      0.7834767 = product of:
        1.1521716 = sum of:
          0.020634053 = weight(abstract_txt:documents in 4648) [ClassicSimilarity], result of:
            0.020634053 = score(doc=4648,freq=3.0), product of:
              0.074008405 = queryWeight, product of:
                1.0065705 = boost
                4.1208124 = idf(docFreq=1944, maxDocs=44083)
                0.01784243 = queryNorm
              0.2788069 = fieldWeight in 4648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1208124 = idf(docFreq=1944, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.007264049 = weight(abstract_txt:research in 4648) [ClassicSimilarity], result of:
            0.007264049 = score(doc=4648,freq=1.0), product of:
              0.05857296 = queryWeight, product of:
                1.0340025 = boost
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.01784243 = queryNorm
              0.124017105 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.009153684 = weight(abstract_txt:such in 4648) [ClassicSimilarity], result of:
            0.009153684 = score(doc=4648,freq=1.0), product of:
              0.0683348 = queryWeight, product of:
                1.1168479 = boost
                3.4292088 = idf(docFreq=3883, maxDocs=44083)
                0.01784243 = queryNorm
              0.13395347 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4292088 = idf(docFreq=3883, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.027360432 = weight(abstract_txt:whether in 4648) [ClassicSimilarity], result of:
            0.027360432 = score(doc=4648,freq=2.0), product of:
              0.10225167 = queryWeight, product of:
                1.1831474 = boost
                4.8437033 = idf(docFreq=943, maxDocs=44083)
                0.01784243 = queryNorm
              0.26757932 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8437033 = idf(docFreq=943, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.04160877 = weight(abstract_txt:question in 4648) [ClassicSimilarity], result of:
            0.04160877 = score(doc=4648,freq=3.0), product of:
              0.11812666 = queryWeight, product of:
                1.2716794 = boost
                5.2061453 = idf(docFreq=656, maxDocs=44083)
                0.01784243 = queryNorm
              0.3522386 = fieldWeight in 4648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.2061453 = idf(docFreq=656, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.057141434 = weight(abstract_txt:automatically in 4648) [ClassicSimilarity], result of:
            0.057141434 = score(doc=4648,freq=4.0), product of:
              0.13260072 = queryWeight, product of:
                1.3473382 = boost
                5.5158854 = idf(docFreq=481, maxDocs=44083)
                0.01784243 = queryNorm
              0.43092853 = fieldWeight in 4648, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5158854 = idf(docFreq=481, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.016835207 = weight(abstract_txt:retrieval in 4648) [ClassicSimilarity], result of:
            0.016835207 = score(doc=4648,freq=2.0), product of:
              0.08770352 = queryWeight, product of:
                1.4146094 = boost
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.01784243 = queryNorm
              0.1919559 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.026013592 = weight(abstract_txt:they in 4648) [ClassicSimilarity], result of:
            0.026013592 = score(doc=4648,freq=3.0), product of:
              0.10240186 = queryWeight, product of:
                1.5285581 = boost
                3.7546706 = idf(docFreq=2804, maxDocs=44083)
                0.01784243 = queryNorm
              0.25403437 = fieldWeight in 4648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7546706 = idf(docFreq=2804, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.11019506 = weight(abstract_txt:clumping in 4648) [ClassicSimilarity], result of:
            0.11019506 = score(doc=4648,freq=1.0), product of:
              0.28488928 = queryWeight, product of:
                1.6124865 = boost
                9.90207 = idf(docFreq=5, maxDocs=44083)
                0.01784243 = queryNorm
              0.3867996 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.90207 = idf(docFreq=5, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.06018229 = weight(abstract_txt:classification in 4648) [ClassicSimilarity], result of:
            0.06018229 = score(doc=4648,freq=11.0), product of:
              0.11616169 = queryWeight, product of:
                1.6280191 = boost
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.01784243 = queryNorm
              0.51809067 = fieldWeight in 4648, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.032644294 = weight(abstract_txt:terms in 4648) [ClassicSimilarity], result of:
            0.032644294 = score(doc=4648,freq=3.0), product of:
              0.11913676 = queryWeight, product of:
                1.6487353 = boost
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.01784243 = queryNorm
              0.2740069 = fieldWeight in 4648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.045578066 = weight(abstract_txt:similar in 4648) [ClassicSimilarity], result of:
            0.045578066 = score(doc=4648,freq=2.0), product of:
              0.158151 = queryWeight, product of:
                1.6990612 = boost
                5.216857 = idf(docFreq=649, maxDocs=44083)
                0.01784243 = queryNorm
              0.28819335 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.216857 = idf(docFreq=649, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.15491343 = weight(abstract_txt:jones in 4648) [ClassicSimilarity], result of:
            0.15491343 = score(doc=4648,freq=2.0), product of:
              0.35751483 = queryWeight, product of:
                2.5545833 = boost
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.01784243 = queryNorm
              0.43330628 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.036789622 = weight(abstract_txt:that in 4648) [ClassicSimilarity], result of:
            0.036789622 = score(doc=4648,freq=12.0), product of:
              0.11455759 = queryWeight, product of:
                2.705322 = boost
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.01784243 = queryNorm
              0.32114524 = fieldWeight in 4648, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.1246835 = weight(abstract_txt:automatic in 4648) [ClassicSimilarity], result of:
            0.1246835 = score(doc=4648,freq=5.0), product of:
              0.2746671 = queryWeight, product of:
                2.9620705 = boost
                5.1970544 = idf(docFreq=662, maxDocs=44083)
                0.01784243 = queryNorm
              0.45394403 = fieldWeight in 4648, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1970544 = idf(docFreq=662, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.24422027 = weight(abstract_txt:sparck in 4648) [ClassicSimilarity], result of:
            0.24422027 = score(doc=4648,freq=2.0), product of:
              0.48427176 = queryWeight, product of:
                2.9731555 = boost
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.01784243 = queryNorm
              0.5043042 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
          0.13695383 = weight(abstract_txt:classes in 4648) [ClassicSimilarity], result of:
            0.13695383 = score(doc=4648,freq=3.0), product of:
              0.3466834 = queryWeight, product of:
                3.3278105 = boost
                5.8387575 = idf(docFreq=348, maxDocs=44083)
                0.01784243 = queryNorm
              0.39504004 = fieldWeight in 4648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8387575 = idf(docFreq=348, maxDocs=44083)
                0.0390625 = fieldNorm(doc=4648)
        0.68 = coord(17/25)
    
  2. Hjoerland, B.; Pedersen, K.N.: ¬A substantive theory of classification for information retrieval (2005) 0.22
    0.21560197 = sum of:
      0.21560197 = product of:
        0.770007 = sum of:
          0.011622478 = weight(abstract_txt:research in 2893) [ClassicSimilarity], result of:
            0.011622478 = score(doc=2893,freq=1.0), product of:
              0.05857296 = queryWeight, product of:
                1.0340025 = boost
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.01784243 = queryNorm
              0.19842736 = fieldWeight in 2893, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.032990135 = weight(abstract_txt:retrieval in 2893) [ClassicSimilarity], result of:
            0.032990135 = score(doc=2893,freq=3.0), product of:
              0.08770352 = queryWeight, product of:
                1.4146094 = boost
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.01784243 = queryNorm
              0.3761552 = fieldWeight in 2893, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.058066055 = weight(abstract_txt:classification in 2893) [ClassicSimilarity], result of:
            0.058066055 = score(doc=2893,freq=4.0), product of:
              0.11616169 = queryWeight, product of:
                1.6280191 = boost
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.01784243 = queryNorm
              0.49987268 = fieldWeight in 2893, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.24786147 = weight(abstract_txt:jones in 2893) [ClassicSimilarity], result of:
            0.24786147 = score(doc=2893,freq=2.0), product of:
              0.35751483 = queryWeight, product of:
                2.5545833 = boost
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.01784243 = queryNorm
              0.69329005 = fieldWeight in 2893, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.016992401 = weight(abstract_txt:that in 2893) [ClassicSimilarity], result of:
            0.016992401 = score(doc=2893,freq=1.0), product of:
              0.11455759 = queryWeight, product of:
                2.705322 = boost
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.01784243 = queryNorm
              0.14833064 = fieldWeight in 2893, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.12617083 = weight(abstract_txt:automatic in 2893) [ClassicSimilarity], result of:
            0.12617083 = score(doc=2893,freq=2.0), product of:
              0.2746671 = queryWeight, product of:
                2.9620705 = boost
                5.1970544 = idf(docFreq=662, maxDocs=44083)
                0.01784243 = queryNorm
              0.45935905 = fieldWeight in 2893, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1970544 = idf(docFreq=662, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
          0.27630368 = weight(abstract_txt:sparck in 2893) [ClassicSimilarity], result of:
            0.27630368 = score(doc=2893,freq=1.0), product of:
              0.48427176 = queryWeight, product of:
                2.9731555 = boost
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.01784243 = queryNorm
              0.57055503 = fieldWeight in 2893, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.0625 = fieldNorm(doc=2893)
        0.28 = coord(7/25)
    
  3. Robertson, M.; Willett, P.: ¬An upperbound to the performance of ranked output searching : optimal weighting of query terms using a genetic algorithm (1996) 0.21
    0.2071052 = sum of:
      0.2071052 = product of:
        1.035526 = sum of:
          0.038093727 = weight(abstract_txt:retrieval in 47) [ClassicSimilarity], result of:
            0.038093727 = score(doc=47,freq=1.0), product of:
              0.08770352 = queryWeight, product of:
                1.4146094 = boost
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.01784243 = queryNorm
              0.43434662 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.474773 = idf(docFreq=3710, maxDocs=44083)
                0.125 = fieldNorm(doc=47)
          0.060311012 = weight(abstract_txt:terms in 47) [ClassicSimilarity], result of:
            0.060311012 = score(doc=47,freq=1.0), product of:
              0.11913676 = queryWeight, product of:
                1.6487353 = boost
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.01784243 = queryNorm
              0.50623345 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.125 = fieldNorm(doc=47)
          0.35052907 = weight(abstract_txt:jones in 47) [ClassicSimilarity], result of:
            0.35052907 = score(doc=47,freq=1.0), product of:
              0.35751483 = queryWeight, product of:
                2.5545833 = boost
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.01784243 = queryNorm
              0.9804602 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.125 = fieldNorm(doc=47)
          0.033984803 = weight(abstract_txt:that in 47) [ClassicSimilarity], result of:
            0.033984803 = score(doc=47,freq=1.0), product of:
              0.11455759 = queryWeight, product of:
                2.705322 = boost
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.01784243 = queryNorm
              0.2966613 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.125 = fieldNorm(doc=47)
          0.55260736 = weight(abstract_txt:sparck in 47) [ClassicSimilarity], result of:
            0.55260736 = score(doc=47,freq=1.0), product of:
              0.48427176 = queryWeight, product of:
                2.9731555 = boost
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.01784243 = queryNorm
              1.1411101 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.125 = fieldNorm(doc=47)
        0.2 = coord(5/25)
    
  4. Robertson, S.E.: On relevance weight estimation and query expansion (1986) 0.20
    0.20031387 = sum of:
      0.20031387 = product of:
        1.2519617 = sum of:
          0.047652304 = weight(abstract_txt:documents in 3875) [ClassicSimilarity], result of:
            0.047652304 = score(doc=3875,freq=1.0), product of:
              0.074008405 = queryWeight, product of:
                1.0065705 = boost
                4.1208124 = idf(docFreq=1944, maxDocs=44083)
                0.01784243 = queryNorm
              0.6438769 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1208124 = idf(docFreq=1944, maxDocs=44083)
                0.15625 = fieldNorm(doc=3875)
          0.07538877 = weight(abstract_txt:terms in 3875) [ClassicSimilarity], result of:
            0.07538877 = score(doc=3875,freq=1.0), product of:
              0.11913676 = queryWeight, product of:
                1.6487353 = boost
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.01784243 = queryNorm
              0.6327918 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.15625 = fieldNorm(doc=3875)
          0.43816134 = weight(abstract_txt:jones in 3875) [ClassicSimilarity], result of:
            0.43816134 = score(doc=3875,freq=1.0), product of:
              0.35751483 = queryWeight, product of:
                2.5545833 = boost
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.01784243 = queryNorm
              1.2255753 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.843682 = idf(docFreq=46, maxDocs=44083)
                0.15625 = fieldNorm(doc=3875)
          0.69075924 = weight(abstract_txt:sparck in 3875) [ClassicSimilarity], result of:
            0.69075924 = score(doc=3875,freq=1.0), product of:
              0.48427176 = queryWeight, product of:
                2.9731555 = boost
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.01784243 = queryNorm
              1.4263875 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1288805 = idf(docFreq=12, maxDocs=44083)
                0.15625 = fieldNorm(doc=3875)
        0.16 = coord(4/25)
    
  5. Sjögårde, P.; Ahlgren, P.; Waltman, L.: Algorithmic labeling in hierarchical classifications of publications : evaluation of bibliographic fields and term weighting approaches (2021) 0.17
    0.16727155 = sum of:
      0.16727155 = product of:
        0.5227236 = sum of:
          0.07283741 = weight(abstract_txt:classifications in 481) [ClassicSimilarity], result of:
            0.07283741 = score(doc=481,freq=3.0), product of:
              0.10956805 = queryWeight, product of:
                6.14087 = idf(docFreq=257, maxDocs=44083)
                0.01784243 = queryNorm
              0.6647687 = fieldWeight in 481, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.14087 = idf(docFreq=257, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.016436666 = weight(abstract_txt:research in 481) [ClassicSimilarity], result of:
            0.016436666 = score(doc=481,freq=2.0), product of:
              0.05857296 = queryWeight, product of:
                1.0340025 = boost
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.01784243 = queryNorm
              0.28061867 = fieldWeight in 481, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1748378 = idf(docFreq=5008, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.014645894 = weight(abstract_txt:such in 481) [ClassicSimilarity], result of:
            0.014645894 = score(doc=481,freq=1.0), product of:
              0.0683348 = queryWeight, product of:
                1.1168479 = boost
                3.4292088 = idf(docFreq=3883, maxDocs=44083)
                0.01784243 = queryNorm
              0.21432555 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4292088 = idf(docFreq=3883, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.05944214 = weight(abstract_txt:keywords in 481) [ClassicSimilarity], result of:
            0.05944214 = score(doc=481,freq=1.0), product of:
              0.15797247 = queryWeight, product of:
                1.4705994 = boost
                6.0205064 = idf(docFreq=290, maxDocs=44083)
                0.01784243 = queryNorm
              0.37628165 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0205064 = idf(docFreq=290, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.029033028 = weight(abstract_txt:classification in 481) [ClassicSimilarity], result of:
            0.029033028 = score(doc=481,freq=1.0), product of:
              0.11616169 = queryWeight, product of:
                1.6280191 = boost
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.01784243 = queryNorm
              0.24993634 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9989815 = idf(docFreq=2196, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.060311012 = weight(abstract_txt:terms in 481) [ClassicSimilarity], result of:
            0.060311012 = score(doc=481,freq=4.0), product of:
              0.11913676 = queryWeight, product of:
                1.6487353 = boost
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.01784243 = queryNorm
              0.50623345 = fieldWeight in 481, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0498676 = idf(docFreq=2087, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.016992401 = weight(abstract_txt:that in 481) [ClassicSimilarity], result of:
            0.016992401 = score(doc=481,freq=1.0), product of:
              0.11455759 = queryWeight, product of:
                2.705322 = boost
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.01784243 = queryNorm
              0.14833064 = fieldWeight in 481, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3732903 = idf(docFreq=11164, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
          0.25302505 = weight(abstract_txt:classes in 481) [ClassicSimilarity], result of:
            0.25302505 = score(doc=481,freq=4.0), product of:
              0.3466834 = queryWeight, product of:
                3.3278105 = boost
                5.8387575 = idf(docFreq=348, maxDocs=44083)
                0.01784243 = queryNorm
              0.7298447 = fieldWeight in 481, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8387575 = idf(docFreq=348, maxDocs=44083)
                0.0625 = fieldNorm(doc=481)
        0.32 = coord(8/25)
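
In the content-based list, each final score is the sum of the per-term scores multiplied by a coordination factor, coord(matched/total), which rewards documents matching more of the 25 query terms. A minimal sketch, using the four term scores printed for entry 4 (doc=3875); the variable names are illustrative only:

```python
# Per-term scores copied from the explain tree for doc=3875
# (documents, terms, jones, sparck).
term_scores = [0.047652304, 0.07538877, 0.43816134, 0.69075924]

matched_terms = 4      # query terms found in the document
query_terms = 25       # query terms in total, per coord(4/25)

coord = matched_terms / query_terms
score = sum(term_scores) * coord
print(round(score, 4))
```

This explains why entry 1 (17 of 25 terms matched, coord 0.68) ranks far above entry 4 even though entry 4 has the larger raw sum (1.2519617 against 1.1521716).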