Document (#33462)

Author
Golub, K.
Hamon, T.
Ardö, A.
Title
Automated classification of textual documents based on a controlled vocabulary in engineering
Source
Knowledge organization. 34(2007) no.4, pp. 247-263
Year
2007
Abstract
Automated subject classification has been a challenging research issue for many years now, receiving particular attention in the past decade due to the rapid increase of digital documents. The most frequent approach to automated classification is machine learning. It, however, requires training documents and performs well on new documents only if these are similar enough to the former. We explore a string-matching algorithm based on a controlled vocabulary, which does not require training documents; instead it reuses the intellectual work put into creating the controlled vocabulary. Terms from the Engineering Information thesaurus and classification scheme were matched against the titles and abstracts of engineering papers from the Compendex database. Simple string-matching was enhanced by several methods such as term weighting schemes and cut-offs, exclusion of certain terms, and enrichment of the controlled vocabulary with automatically extracted terms. The best results are 76% recall when the controlled vocabulary is enriched with new terms, and 79% precision when certain terms are excluded. Precision of individual classes is up to 98%. These results are comparable to state-of-the-art machine-learning algorithms.
Theme
Automatic classification
Field
Engineering
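
The approach described in the abstract (matching controlled-vocabulary terms against title and abstract text, weighting the matches, applying a cut-off, and excluding certain terms) might be sketched as follows. This is a minimal illustration only: the vocabulary entries, class codes, weights, cut-off value, and the `classify` function are all hypothetical and are not taken from the paper or from the Engineering Information thesaurus.

```python
import re
from collections import defaultdict

# Hypothetical mini-vocabulary: term -> (class code, weight).
# Codes and weights are invented for illustration only.
VOCABULARY = {
    "heat transfer": ("C1", 2.0),
    "heat exchangers": ("C1", 3.0),
    "neural networks": ("C2", 3.0),
    "networks": ("C2", 1.0),
}


def classify(text, vocabulary=VOCABULARY, cutoff=2.0, excluded=frozenset()):
    """Return {class code: score} for classes whose score reaches the cut-off.

    A class score is the sum, over all matching vocabulary terms, of
    term weight times the number of whole-word occurrences in the text.
    """
    # Lowercase and reduce the text to alphanumeric tokens joined by spaces.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    scores = defaultdict(float)
    for term, (class_code, weight) in vocabulary.items():
        if term in excluded:
            continue  # term exclusion, as mentioned in the abstract
        pattern = rf"\b{re.escape(term)}\b"
        occurrences = len(re.findall(pattern, normalized))
        scores[class_code] += weight * occurrences
    # Cut-off: keep only classes with enough accumulated evidence.
    return {code: score for code, score in scores.items() if score >= cutoff}


if __name__ == "__main__":
    print(classify("Heat transfer in compact heat exchangers"))
    print(classify("Neural networks", excluded={"networks"}))
```

Raising `cutoff` or adding terms to `excluded` trades recall for precision in this sketch, while adding vocabulary entries does the opposite. That mirrors the direction of the trade-off reported in the abstract, but not its actual figures.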

Similar documents (author)

  1. Koch, T.; Golub, K.; Ardö, A.: Users browsing behaviour in a DDC-based Web service : a log analysis (2006) 4.71
    4.7139244 = sum of:
      4.7139244 = sum of:
        2.0322585 = weight(author_txt:golub in 2234) [ClassicSimilarity], result of:
          2.0322585 = score(doc=2234,freq=1.0), product of:
            0.63922495 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.07539798 = queryNorm
            3.179254 = fieldWeight in 2234, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.375 = fieldNorm(doc=2234)
        2.681666 = weight(author_txt:ardö in 2234) [ClassicSimilarity], result of:
          2.681666 = score(doc=2234,freq=1.0), product of:
            0.7690198 = queryWeight, product of:
              1.0968366 = boost
              9.298992 = idf(docFreq=10, maxDocs=44218)
              0.07539798 = queryNorm
            3.487122 = fieldWeight in 2234, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.298992 = idf(docFreq=10, maxDocs=44218)
              0.375 = fieldNorm(doc=2234)
    
  2. Ardö, A.; Koch, T.: Lunds Universitets Elektroniska Bibliotek : Del.2: Gopher, World Wide Web (WWW). Planerade projekt (1993) 1.79
    1.7877772 = sum of:
      1.7877772 = product of:
        3.5755544 = sum of:
          3.5755544 = weight(author_txt:ardö in 6001) [ClassicSimilarity], result of:
            3.5755544 = score(doc=6001,freq=1.0), product of:
              0.7690198 = queryWeight, product of:
                1.0968366 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.07539798 = queryNorm
              4.649496 = fieldWeight in 6001, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.5 = fieldNorm(doc=6001)
        0.5 = coord(1/2)
    
  3. Ardö, A.; Koch, T.: Wide-area information server (WAIS) as the hub of an electronic library service at Lund University (1993) 1.79
    1.7877772 = sum of:
      1.7877772 = product of:
        3.5755544 = sum of:
          3.5755544 = weight(author_txt:ardö in 8459) [ClassicSimilarity], result of:
            3.5755544 = score(doc=8459,freq=1.0), product of:
              0.7690198 = queryWeight, product of:
                1.0968366 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.07539798 = queryNorm
              4.649496 = fieldWeight in 8459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.5 = fieldNorm(doc=8459)
        0.5 = coord(1/2)
    
  4. Ardö, A.; Koch, T.: Automatic classification applied to full-text Internet documents in a robot-generated subject index (1999) 1.79
    1.7877772 = sum of:
      1.7877772 = product of:
        3.5755544 = sum of:
          3.5755544 = weight(author_txt:ardö in 382) [ClassicSimilarity], result of:
            3.5755544 = score(doc=382,freq=1.0), product of:
              0.7690198 = queryWeight, product of:
                1.0968366 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.07539798 = queryNorm
              4.649496 = fieldWeight in 382, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.5 = fieldNorm(doc=382)
        0.5 = coord(1/2)
    
  5. Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 1.79
    1.7877772 = sum of:
      1.7877772 = product of:
        3.5755544 = sum of:
          3.5755544 = weight(author_txt:ardö in 1667) [ClassicSimilarity], result of:
            3.5755544 = score(doc=1667,freq=1.0), product of:
              0.7690198 = queryWeight, product of:
                1.0968366 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.07539798 = queryNorm
              4.649496 = fieldWeight in 1667, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.5 = fieldNorm(doc=1667)
        0.5 = coord(1/2)
    

Similar documents (content)

  1. Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.51
    0.5107906 = sum of:
      0.5107906 = product of:
        1.1608877 = sum of:
          0.013586603 = weight(abstract_txt:results in 4558) [ClassicSimilarity], result of:
            0.013586603 = score(doc=4558,freq=1.0), product of:
              0.062423695 = queryWeight, product of:
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.017925367 = queryNorm
              0.21765138 = fieldWeight in 4558, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.048786756 = weight(abstract_txt:learning in 4558) [ClassicSimilarity], result of:
            0.048786756 = score(doc=4558,freq=2.0), product of:
              0.11618057 = queryWeight, product of:
                1.3642439 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017925367 = queryNorm
              0.41992182 = fieldWeight in 4558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.04731554 = weight(abstract_txt:machine in 4558) [ClassicSimilarity], result of:
            0.04731554 = score(doc=4558,freq=1.0), product of:
              0.14342056 = queryWeight, product of:
                1.5157619 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017925367 = queryNorm
              0.32990766 = fieldWeight in 4558, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.1006464 = weight(abstract_txt:matching in 4558) [ClassicSimilarity], result of:
            0.1006464 = score(doc=4558,freq=2.0), product of:
              0.18827718 = queryWeight, product of:
                1.7366973 = boost
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.017925367 = queryNorm
              0.53456503 = fieldWeight in 4558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.16806035 = weight(abstract_txt:string in 4558) [ClassicSimilarity], result of:
            0.16806035 = score(doc=4558,freq=2.0), product of:
              0.26499787 = queryWeight, product of:
                2.060376 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.017925367 = queryNorm
              0.6341951 = fieldWeight in 4558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.040934812 = weight(abstract_txt:classification in 4558) [ClassicSimilarity], result of:
            0.040934812 = score(doc=4558,freq=1.0), product of:
              0.16406429 = queryWeight, product of:
                2.2927003 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017925367 = queryNorm
              0.2495047 = fieldWeight in 4558, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.14704633 = weight(abstract_txt:automated in 4558) [ClassicSimilarity], result of:
            0.14704633 = score(doc=4558,freq=3.0), product of:
              0.24241994 = queryWeight, product of:
                2.4135432 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.017925367 = queryNorm
              0.60657686 = fieldWeight in 4558, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.07521633 = weight(abstract_txt:terms in 4558) [ClassicSimilarity], result of:
            0.07521633 = score(doc=4558,freq=2.0), product of:
              0.21043612 = queryWeight, product of:
                2.9030561 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017925367 = queryNorm
              0.3574307 = fieldWeight in 4558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.079620555 = weight(abstract_txt:documents in 4558) [ClassicSimilarity], result of:
            0.079620555 = score(doc=4558,freq=2.0), product of:
              0.21857257 = queryWeight, product of:
                2.9586468 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017925367 = queryNorm
              0.36427513 = fieldWeight in 4558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.21431209 = weight(abstract_txt:vocabulary in 4558) [ClassicSimilarity], result of:
            0.21431209 = score(doc=4558,freq=3.0), product of:
              0.36947033 = queryWeight, product of:
                3.846671 = boost
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.017925367 = queryNorm
              0.58005226 = fieldWeight in 4558, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
          0.22536191 = weight(abstract_txt:controlled in 4558) [ClassicSimilarity], result of:
            0.22536191 = score(doc=4558,freq=3.0), product of:
              0.38206342 = queryWeight, product of:
                3.9116771 = boost
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.017925367 = queryNorm
              0.5898547 = fieldWeight in 4558, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.0625 = fieldNorm(doc=4558)
        0.44 = coord(11/25)
    
  2. Golub, K.: Automated subject classification of textual Web pages, based on a controlled vocabulary : challenges and recommendations (2006) 0.28
    0.2777641 = sum of:
      0.2777641 = product of:
        0.99201465 = sum of:
          0.08895969 = weight(abstract_txt:matching in 5897) [ClassicSimilarity], result of:
            0.08895969 = score(doc=5897,freq=1.0), product of:
              0.18827718 = queryWeight, product of:
                1.7366973 = boost
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.017925367 = queryNorm
              0.4724932 = fieldWeight in 5897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.21007544 = weight(abstract_txt:string in 5897) [ClassicSimilarity], result of:
            0.21007544 = score(doc=5897,freq=2.0), product of:
              0.26499787 = queryWeight, product of:
                2.060376 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.017925367 = queryNorm
              0.79274386 = fieldWeight in 5897, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.08862646 = weight(abstract_txt:classification in 5897) [ClassicSimilarity], result of:
            0.08862646 = score(doc=5897,freq=3.0), product of:
              0.16406429 = queryWeight, product of:
                2.2927003 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017925367 = queryNorm
              0.5401935 = fieldWeight in 5897, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.10612155 = weight(abstract_txt:automated in 5897) [ClassicSimilarity], result of:
            0.10612155 = score(doc=5897,freq=1.0), product of:
              0.24241994 = queryWeight, product of:
                2.4135432 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.017925367 = queryNorm
              0.43775916 = fieldWeight in 5897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.18092416 = weight(abstract_txt:engineering in 5897) [ClassicSimilarity], result of:
            0.18092416 = score(doc=5897,freq=2.0), product of:
              0.27459145 = queryWeight, product of:
                2.5687058 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.017925367 = queryNorm
              0.65888494 = fieldWeight in 5897, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.15466644 = weight(abstract_txt:vocabulary in 5897) [ClassicSimilarity], result of:
            0.15466644 = score(doc=5897,freq=1.0), product of:
              0.36947033 = queryWeight, product of:
                3.846671 = boost
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.017925367 = queryNorm
              0.41861665 = fieldWeight in 5897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
          0.16264094 = weight(abstract_txt:controlled in 5897) [ClassicSimilarity], result of:
            0.16264094 = score(doc=5897,freq=1.0), product of:
              0.38206342 = queryWeight, product of:
                3.9116771 = boost
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.017925367 = queryNorm
              0.42569098 = fieldWeight in 5897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.078125 = fieldNorm(doc=5897)
        0.28 = coord(7/25)
    
  3. Hubain, R.; Wilde, M. De; Hooland, S. van: Automated SKOS vocabulary design for the biopharmaceutical industry (2016) 0.22
    0.2185672 = sum of:
      0.2185672 = product of:
        0.9106966 = sum of:
          0.051746167 = weight(abstract_txt:learning in 5132) [ClassicSimilarity], result of:
            0.051746167 = score(doc=5132,freq=1.0), product of:
              0.11618057 = queryWeight, product of:
                1.3642439 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017925367 = queryNorm
              0.44539434 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.070973314 = weight(abstract_txt:machine in 5132) [ClassicSimilarity], result of:
            0.070973314 = score(doc=5132,freq=1.0), product of:
              0.14342056 = queryWeight, product of:
                1.5157619 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017925367 = queryNorm
              0.49486148 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.18009424 = weight(abstract_txt:automated in 5132) [ClassicSimilarity], result of:
            0.18009424 = score(doc=5132,freq=2.0), product of:
              0.24241994 = queryWeight, product of:
                2.4135432 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.017925367 = queryNorm
              0.7429019 = fieldWeight in 5132, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.14627229 = weight(abstract_txt:documents in 5132) [ClassicSimilarity], result of:
            0.14627229 = score(doc=5132,freq=3.0), product of:
              0.21857257 = queryWeight, product of:
                2.9586468 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017925367 = queryNorm
              0.6692161 = fieldWeight in 5132, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.18559971 = weight(abstract_txt:vocabulary in 5132) [ClassicSimilarity], result of:
            0.18559971 = score(doc=5132,freq=1.0), product of:
              0.36947033 = queryWeight, product of:
                3.846671 = boost
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.017925367 = queryNorm
              0.50233996 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.27601084 = weight(abstract_txt:controlled in 5132) [ClassicSimilarity], result of:
            0.27601084 = score(doc=5132,freq=2.0), product of:
              0.38206342 = queryWeight, product of:
                3.9116771 = boost
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.017925367 = queryNorm
              0.7224215 = fieldWeight in 5132, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
        0.24 = coord(6/25)
    
  4. Bhatia, S.K.; Deogun, J.S.; Raghavan, V.V.: Conceptual query formulation and retrieval (1995) 0.19
    0.19277266 = sum of:
      0.19277266 = product of:
        0.60241455 = sum of:
          0.034497447 = weight(abstract_txt:learning in 2607) [ClassicSimilarity], result of:
            0.034497447 = score(doc=2607,freq=1.0), product of:
              0.11618057 = queryWeight, product of:
                1.3642439 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017925367 = queryNorm
              0.29692957 = fieldWeight in 2607, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.06078251 = weight(abstract_txt:training in 2607) [ClassicSimilarity], result of:
            0.06078251 = score(doc=2607,freq=2.0), product of:
              0.13451931 = queryWeight, product of:
                1.4679713 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.017925367 = queryNorm
              0.4518497 = fieldWeight in 2607, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.04731554 = weight(abstract_txt:machine in 2607) [ClassicSimilarity], result of:
            0.04731554 = score(doc=2607,freq=1.0), product of:
              0.14342056 = queryWeight, product of:
                1.5157619 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017925367 = queryNorm
              0.32990766 = fieldWeight in 2607, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.05426346 = weight(abstract_txt:precision in 2607) [ClassicSimilarity], result of:
            0.05426346 = score(doc=2607,freq=1.0), product of:
              0.15713775 = queryWeight, product of:
                1.586593 = boost
                5.5251865 = idf(docFreq=478, maxDocs=44218)
                0.017925367 = queryNorm
              0.34532416 = fieldWeight in 2607, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5251865 = idf(docFreq=478, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.040934812 = weight(abstract_txt:classification in 2607) [ClassicSimilarity], result of:
            0.040934812 = score(doc=2607,freq=1.0), product of:
              0.16406429 = queryWeight, product of:
                2.2927003 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017925367 = queryNorm
              0.2495047 = fieldWeight in 2607, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.09212081 = weight(abstract_txt:terms in 2607) [ClassicSimilarity], result of:
            0.09212081 = score(doc=2607,freq=3.0), product of:
              0.21043612 = queryWeight, product of:
                2.9030561 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017925367 = queryNorm
              0.4377614 = fieldWeight in 2607, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.09751486 = weight(abstract_txt:documents in 2607) [ClassicSimilarity], result of:
            0.09751486 = score(doc=2607,freq=3.0), product of:
              0.21857257 = queryWeight, product of:
                2.9586468 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017925367 = queryNorm
              0.44614407 = fieldWeight in 2607, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
          0.1749851 = weight(abstract_txt:vocabulary in 2607) [ClassicSimilarity], result of:
            0.1749851 = score(doc=2607,freq=2.0), product of:
              0.36947033 = queryWeight, product of:
                3.846671 = boost
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.017925367 = queryNorm
              0.47361067 = fieldWeight in 2607, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.0625 = fieldNorm(doc=2607)
        0.32 = coord(8/25)
    
  5. Dumais, S.T.: Latent semantic analysis (2003) 0.18
    0.18333799 = sum of:
      0.18333799 = product of:
        0.57293123 = sum of:
          0.01148304 = weight(abstract_txt:when in 2462) [ClassicSimilarity], result of:
            0.01148304 = score(doc=2462,freq=1.0), product of:
              0.08857954 = queryWeight, product of:
                1.19122 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.017925367 = queryNorm
              0.12963535 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.017248724 = weight(abstract_txt:learning in 2462) [ClassicSimilarity], result of:
            0.017248724 = score(doc=2462,freq=1.0), product of:
              0.11618057 = queryWeight, product of:
                1.3642439 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.017925367 = queryNorm
              0.14846478 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.02365777 = weight(abstract_txt:machine in 2462) [ClassicSimilarity], result of:
            0.02365777 = score(doc=2462,freq=1.0), product of:
              0.14342056 = queryWeight, product of:
                1.5157619 = boost
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.017925367 = queryNorm
              0.16495383 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2785225 = idf(docFreq=612, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.035583876 = weight(abstract_txt:matching in 2462) [ClassicSimilarity], result of:
            0.035583876 = score(doc=2462,freq=1.0), product of:
              0.18827718 = queryWeight, product of:
                1.7366973 = boost
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.017925367 = queryNorm
              0.18899728 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.047913 = idf(docFreq=283, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.08819896 = weight(abstract_txt:terms in 2462) [ClassicSimilarity], result of:
            0.08819896 = score(doc=2462,freq=11.0), product of:
              0.21043612 = queryWeight, product of:
                2.9030561 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017925367 = queryNorm
              0.41912463 = fieldWeight in 2462, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.08901849 = weight(abstract_txt:documents in 2462) [ClassicSimilarity], result of:
            0.08901849 = score(doc=2462,freq=10.0), product of:
              0.21857257 = queryWeight, product of:
                2.9586468 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017925367 = queryNorm
              0.40727198 = fieldWeight in 2462, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.12373314 = weight(abstract_txt:vocabulary in 2462) [ClassicSimilarity], result of:
            0.12373314 = score(doc=2462,freq=4.0), product of:
              0.36947033 = queryWeight, product of:
                3.846671 = boost
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.017925367 = queryNorm
              0.33489332 = fieldWeight in 2462, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.358293 = idf(docFreq=565, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.18400723 = weight(abstract_txt:controlled in 2462) [ClassicSimilarity], result of:
            0.18400723 = score(doc=2462,freq=8.0), product of:
              0.38206342 = queryWeight, product of:
                3.9116771 = boost
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.017925367 = queryNorm
              0.48161435 = fieldWeight in 2462, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.4488444 = idf(docFreq=516, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
        0.32 = coord(8/25)
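
The score breakdowns in the listings above have the shape of Lucene's ClassicSimilarity (TF-IDF) explanations. As a rough sketch, the per-term arithmetic can be reproduced as below. The function names are illustrative, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption chosen because it is consistent with the figures shown here; Lucene itself computes in single precision, so the last digits may differ.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency, in the form consistent with the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Term frequency factor: square root of the raw term count.
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


if __name__ == "__main__":
    # Values copied from the first author-similarity entry (doc=2234).
    golub = term_score(1.0, 24, 44218, 0.07539798, 0.375)
    ardo = term_score(1.0, 10, 44218, 0.07539798, 0.375, boost=1.0968366)
    print(golub, ardo, golub + ardo)

    # Second entry (doc=6001): one of two query terms matched,
    # so the sum is scaled by coord(1/2) = 0.5.
    print(term_score(1.0, 10, 44218, 0.07539798, 0.5, boost=1.0968366) * 0.5)
```

The `coord(m/n)` factor in the listings scales the summed term scores by the fraction of query terms that matched, which is why entries matching only one author term score well below the first entry.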