Document (#33049)

Author
Lim, C.S.
Lee, K.J.
Kim, G.C.
Title
Multiple sets of features for automatic genre classification of web documents
Source
Information processing and management. 41(2005) no.5, p.1263-1276
Year
2005
Abstract
As the amount of information on the Web grows, it becomes difficult to quickly find the desired information among the documents retrieved by a search engine. One way to address this problem is to classify web documents according to various criteria. Most document classification work has focused on the subject or topic of a document. Genre, or style, is a view of a document distinct from its subject or topic, and it can also serve as a criterion for classifying documents. In this paper, we suggest multiple sets of features for classifying the genres of web documents. The basic set of features, proposed in previous studies, is derived from the textual properties of documents, such as the number of sentences or the number of occurrences of a certain word. However, web documents differ from purely textual documents in that they contain URLs and HTML tags within the pages. We introduce new sets of features specific to web documents, extracted from URLs and HTML tags. The present work evaluates the performance of the proposed sets of features and discusses their characteristics. Finally, we conclude which set of features is appropriate for automatic genre classification of web documents.
Theme
Automatic classification

Similar documents (content)

  1. Finn, A.; Kushmerick, N.: Learning to classify documents according to genre (2006) 0.69
    0.68717384 = sum of:
      0.68717384 = product of:
        1.4316123 = sum of:
          0.027583236 = weight(abstract_txt:different in 6010) [ClassicSimilarity], result of:
            0.027583236 = score(doc=6010,freq=2.0), product of:
              0.068109356 = queryWeight, product of:
                1.047824 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.017733112 = queryNorm
              0.40498453 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.027952168 = weight(abstract_txt:number in 6010) [ClassicSimilarity], result of:
            0.027952168 = score(doc=6010,freq=1.0), product of:
              0.08657589 = queryWeight, product of:
                1.1813632 = boost
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.017733112 = queryNorm
              0.3228632 = fieldWeight in 6010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.020845773 = weight(abstract_txt:which in 6010) [ClassicSimilarity], result of:
            0.020845773 = score(doc=6010,freq=2.0), product of:
              0.064687304 = queryWeight, product of:
                1.2506626 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.017733112 = queryNorm
              0.32225448 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.07265686 = weight(abstract_txt:topic in 6010) [ClassicSimilarity], result of:
            0.07265686 = score(doc=6010,freq=2.0), product of:
              0.12990555 = queryWeight, product of:
                1.447101 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.017733112 = queryNorm
              0.5593053 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.075935096 = weight(abstract_txt:multiple in 6010) [ClassicSimilarity], result of:
            0.075935096 = score(doc=6010,freq=2.0), product of:
              0.13378425 = queryWeight, product of:
                1.4685458 = boost
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.017733112 = queryNorm
              0.5675937 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.07855095 = weight(abstract_txt:automatic in 6010) [ClassicSimilarity], result of:
            0.07855095 = score(doc=6010,freq=2.0), product of:
              0.13683932 = queryWeight, product of:
                1.4852188 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017733112 = queryNorm
              0.57403785 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.053448103 = weight(abstract_txt:classification in 6010) [ClassicSimilarity], result of:
            0.053448103 = score(doc=6010,freq=2.0), product of:
              0.121179335 = queryWeight, product of:
                1.711768 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017733112 = queryNorm
              0.44106615 = fieldWeight in 6010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.046987783 = weight(abstract_txt:document in 6010) [ClassicSimilarity], result of:
            0.046987783 = score(doc=6010,freq=1.0), product of:
              0.14011146 = queryWeight, product of:
                1.840634 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017733112 = queryNorm
              0.33536002 = fieldWeight in 6010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.110418506 = weight(abstract_txt:sets in 6010) [ClassicSimilarity], result of:
            0.110418506 = score(doc=6010,freq=1.0), product of:
              0.27257824 = queryWeight, product of:
                2.96446 = boost
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.017733112 = queryNorm
              0.40508923 = fieldWeight in 6010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.46658736 = weight(abstract_txt:genre in 6010) [ClassicSimilarity], result of:
            0.46658736 = score(doc=6010,freq=4.0), product of:
              0.4488128 = queryWeight, product of:
                3.8039308 = boost
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.017733112 = queryNorm
              1.0396035 = fieldWeight in 6010, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.11111565 = weight(abstract_txt:features in 6010) [ClassicSimilarity], result of:
            0.11111565 = score(doc=6010,freq=1.0), product of:
              0.31333613 = queryWeight, product of:
                3.8926992 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.017733112 = queryNorm
              0.35462123 = fieldWeight in 6010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
          0.33953074 = weight(abstract_txt:documents in 6010) [ClassicSimilarity], result of:
            0.33953074 = score(doc=6010,freq=6.0), product of:
              0.4305057 = queryWeight, product of:
                5.890599 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017733112 = queryNorm
              0.7886788 = fieldWeight in 6010, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=6010)
        0.48 = coord(12/25)
    
  2. Ko, Y.; Park, J.; Seo, J.: Improving text categorization using the importance of sentences (2004) 0.31
    0.30579755 = sum of:
      0.30579755 = product of:
        0.8494376 = sum of:
          0.10850383 = weight(abstract_txt:sentences in 2557) [ClassicSimilarity], result of:
            0.10850383 = score(doc=2557,freq=4.0), product of:
              0.12406807 = queryWeight, product of:
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.017733112 = queryNorm
              0.8745509 = fieldWeight in 2557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.015603435 = weight(abstract_txt:different in 2557) [ClassicSimilarity], result of:
            0.015603435 = score(doc=2557,freq=1.0), product of:
              0.068109356 = queryWeight, product of:
                1.047824 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.017733112 = queryNorm
              0.22909386 = fieldWeight in 2557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.044435125 = weight(abstract_txt:automatic in 2557) [ClassicSimilarity], result of:
            0.044435125 = score(doc=2557,freq=1.0), product of:
              0.13683932 = queryWeight, product of:
                1.4852188 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017733112 = queryNorm
              0.32472485 = fieldWeight in 2557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.01337854 = weight(abstract_txt:from in 2557) [ClassicSimilarity], result of:
            0.01337854 = score(doc=2557,freq=1.0), product of:
              0.0774478 = queryWeight, product of:
                1.5801725 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017733112 = queryNorm
              0.17274266 = fieldWeight in 2557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.07518045 = weight(abstract_txt:document in 2557) [ClassicSimilarity], result of:
            0.07518045 = score(doc=2557,freq=4.0), product of:
              0.14011146 = queryWeight, product of:
                1.840634 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017733112 = queryNorm
              0.53657603 = fieldWeight in 2557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.13280436 = weight(abstract_txt:classify in 2557) [ClassicSimilarity], result of:
            0.13280436 = score(doc=2557,freq=1.0), product of:
              0.3250114 = queryWeight, product of:
                2.8033667 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.017733112 = queryNorm
              0.4086145 = fieldWeight in 2557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.12492428 = weight(abstract_txt:sets in 2557) [ClassicSimilarity], result of:
            0.12492428 = score(doc=2557,freq=2.0), product of:
              0.27257824 = queryWeight, product of:
                2.96446 = boost
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.017733112 = queryNorm
              0.45830613 = fieldWeight in 2557, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.17778502 = weight(abstract_txt:features in 2557) [ClassicSimilarity], result of:
            0.17778502 = score(doc=2557,freq=4.0), product of:
              0.31333613 = queryWeight, product of:
                3.8926992 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.017733112 = queryNorm
              0.56739396 = fieldWeight in 2557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
          0.15682252 = weight(abstract_txt:documents in 2557) [ClassicSimilarity], result of:
            0.15682252 = score(doc=2557,freq=2.0), product of:
              0.4305057 = queryWeight, product of:
                5.890599 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017733112 = queryNorm
              0.36427513 = fieldWeight in 2557, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2557)
        0.36 = coord(9/25)
    
  3. Fairthorne, R.A.: Temporal structure in bibliographic classification (1985) 0.30
    0.2952651 = sum of:
      0.2952651 = product of:
        0.67105705 = sum of:
          0.016891215 = weight(abstract_txt:different in 3651) [ClassicSimilarity], result of:
            0.016891215 = score(doc=3651,freq=3.0), product of:
              0.068109356 = queryWeight, product of:
                1.047824 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.017733112 = queryNorm
              0.24800138 = fieldWeight in 3651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.020454884 = weight(abstract_txt:subject in 3651) [ClassicSimilarity], result of:
            0.020454884 = score(doc=3651,freq=3.0), product of:
              0.07738038 = queryWeight, product of:
                1.1168643 = boost
                3.9070187 = idf(docFreq=2415, maxDocs=44218)
                0.017733112 = queryNorm
              0.26434198 = fieldWeight in 3651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9070187 = idf(docFreq=2415, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.013976084 = weight(abstract_txt:number in 3651) [ClassicSimilarity], result of:
            0.013976084 = score(doc=3651,freq=1.0), product of:
              0.08657589 = queryWeight, product of:
                1.1813632 = boost
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.017733112 = queryNorm
              0.1614316 = fieldWeight in 3651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.018052971 = weight(abstract_txt:which in 3651) [ClassicSimilarity], result of:
            0.018052971 = score(doc=3651,freq=6.0), product of:
              0.064687304 = queryWeight, product of:
                1.2506626 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.017733112 = queryNorm
              0.2790806 = fieldWeight in 3651, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.014482695 = weight(abstract_txt:from in 3651) [ClassicSimilarity], result of:
            0.014482695 = score(doc=3651,freq=3.0), product of:
              0.0774478 = queryWeight, product of:
                1.5801725 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017733112 = queryNorm
              0.18699943 = fieldWeight in 3651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.042133886 = weight(abstract_txt:textual in 3651) [ClassicSimilarity], result of:
            0.042133886 = score(doc=3651,freq=1.0), product of:
              0.18067344 = queryWeight, product of:
                1.7066015 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.017733112 = queryNorm
              0.23320466 = fieldWeight in 3651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.049996123 = weight(abstract_txt:classification in 3651) [ClassicSimilarity], result of:
            0.049996123 = score(doc=3651,freq=7.0), product of:
              0.121179335 = queryWeight, product of:
                1.711768 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017733112 = queryNorm
              0.4125796 = fieldWeight in 3651, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.03322538 = weight(abstract_txt:document in 3651) [ClassicSimilarity], result of:
            0.03322538 = score(doc=3651,freq=2.0), product of:
              0.14011146 = queryWeight, product of:
                1.840634 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017733112 = queryNorm
              0.23713535 = fieldWeight in 3651, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.08300273 = weight(abstract_txt:classify in 3651) [ClassicSimilarity], result of:
            0.08300273 = score(doc=3651,freq=1.0), product of:
              0.3250114 = queryWeight, product of:
                2.8033667 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.017733112 = queryNorm
              0.25538406 = fieldWeight in 3651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.110418506 = weight(abstract_txt:sets in 3651) [ClassicSimilarity], result of:
            0.110418506 = score(doc=3651,freq=4.0), product of:
              0.27257824 = queryWeight, product of:
                2.96446 = boost
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.017733112 = queryNorm
              0.40508923 = fieldWeight in 3651, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
          0.2684226 = weight(abstract_txt:documents in 3651) [ClassicSimilarity], result of:
            0.2684226 = score(doc=3651,freq=15.0), product of:
              0.4305057 = queryWeight, product of:
                5.890599 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017733112 = queryNorm
              0.62350535 = fieldWeight in 3651, product of:
                3.8729835 = tf(freq=15.0), with freq of:
                  15.0 = termFreq=15.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3651)
        0.44 = coord(11/25)
    
  4. Liu, Y.; Xu, S.; Blanchard, E.: A local context-aware LDA model for topic modeling in a document network (2017) 0.28
    0.277883 = sum of:
      0.277883 = product of:
        0.69470745 = sum of:
          0.014999404 = weight(abstract_txt:been in 3642) [ClassicSimilarity], result of:
            0.014999404 = score(doc=3642,freq=1.0), product of:
              0.066340074 = queryWeight, product of:
                1.0341249 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.017733112 = queryNorm
              0.22609869 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.01179215 = weight(abstract_txt:which in 3642) [ClassicSimilarity], result of:
            0.01179215 = score(doc=3642,freq=1.0), product of:
              0.064687304 = queryWeight, product of:
                1.2506626 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.017733112 = queryNorm
              0.18229467 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.04387768 = weight(abstract_txt:proposed in 3642) [ClassicSimilarity], result of:
            0.04387768 = score(doc=3642,freq=2.0), product of:
              0.10769918 = queryWeight, product of:
                1.317623 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.017733112 = queryNorm
              0.4074096 = fieldWeight in 3642, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.0711889 = weight(abstract_txt:topic in 3642) [ClassicSimilarity], result of:
            0.0711889 = score(doc=3642,freq=3.0), product of:
              0.12990555 = queryWeight, product of:
                1.447101 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.017733112 = queryNorm
              0.54800504 = fieldWeight in 3642, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.04295538 = weight(abstract_txt:multiple in 3642) [ClassicSimilarity], result of:
            0.04295538 = score(doc=3642,freq=1.0), product of:
              0.13378425 = queryWeight, product of:
                1.4685458 = boost
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.017733112 = queryNorm
              0.3210795 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.137272 = idf(docFreq=705, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.01337854 = weight(abstract_txt:from in 3642) [ClassicSimilarity], result of:
            0.01337854 = score(doc=3642,freq=1.0), product of:
              0.0774478 = queryWeight, product of:
                1.5801725 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017733112 = queryNorm
              0.17274266 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.030234814 = weight(abstract_txt:classification in 3642) [ClassicSimilarity], result of:
            0.030234814 = score(doc=3642,freq=1.0), product of:
              0.121179335 = queryWeight, product of:
                1.711768 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017733112 = queryNorm
              0.2495047 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.106321216 = weight(abstract_txt:document in 3642) [ClassicSimilarity], result of:
            0.106321216 = score(doc=3642,freq=8.0), product of:
              0.14011146 = queryWeight, product of:
                1.840634 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017733112 = queryNorm
              0.7588331 = fieldWeight in 3642, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.088334806 = weight(abstract_txt:sets in 3642) [ClassicSimilarity], result of:
            0.088334806 = score(doc=3642,freq=1.0), product of:
              0.27257824 = queryWeight, product of:
                2.96446 = boost
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.017733112 = queryNorm
              0.32407138 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.185142 = idf(docFreq=672, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
          0.27162457 = weight(abstract_txt:documents in 3642) [ClassicSimilarity], result of:
            0.27162457 = score(doc=3642,freq=6.0), product of:
              0.4305057 = queryWeight, product of:
                5.890599 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017733112 = queryNorm
              0.63094306 = fieldWeight in 3642, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=3642)
        0.4 = coord(10/25)
    
  5. Ringlstetter, C.; Stubbe, A.: Practical aspects of automatic genre classification (2008) 0.26
    0.2568562 = sum of:
      0.2568562 = product of:
        0.9173436 = sum of:
          0.04110093 = weight(abstract_txt:topic in 1954) [ClassicSimilarity], result of:
            0.04110093 = score(doc=1954,freq=1.0), product of:
              0.12990555 = queryWeight, product of:
                1.447101 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.017733112 = queryNorm
              0.31639087 = fieldWeight in 1954, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.06284076 = weight(abstract_txt:automatic in 1954) [ClassicSimilarity], result of:
            0.06284076 = score(doc=1954,freq=2.0), product of:
              0.13683932 = queryWeight, product of:
                1.4852188 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.017733112 = queryNorm
              0.45923027 = fieldWeight in 1954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.018920112 = weight(abstract_txt:from in 1954) [ClassicSimilarity], result of:
            0.018920112 = score(doc=1954,freq=2.0), product of:
              0.0774478 = queryWeight, product of:
                1.5801725 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017733112 = queryNorm
              0.24429502 = fieldWeight in 1954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.05236823 = weight(abstract_txt:classification in 1954) [ClassicSimilarity], result of:
            0.05236823 = score(doc=1954,freq=3.0), product of:
              0.121179335 = queryWeight, product of:
                1.711768 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.017733112 = queryNorm
              0.4321548 = fieldWeight in 1954, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.053160608 = weight(abstract_txt:document in 1954) [ClassicSimilarity], result of:
            0.053160608 = score(doc=1954,freq=2.0), product of:
              0.14011146 = queryWeight, product of:
                1.840634 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.017733112 = queryNorm
              0.37941656 = fieldWeight in 1954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.4173284 = weight(abstract_txt:genre in 1954) [ClassicSimilarity], result of:
            0.4173284 = score(doc=1954,freq=5.0), product of:
              0.4488128 = queryWeight, product of:
                3.8039308 = boost
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.017733112 = queryNorm
              0.92984957 = fieldWeight in 1954, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
          0.27162457 = weight(abstract_txt:documents in 1954) [ClassicSimilarity], result of:
            0.27162457 = score(doc=1954,freq=6.0), product of:
              0.4305057 = queryWeight, product of:
                5.890599 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017733112 = queryNorm
              0.63094306 = fieldWeight in 1954, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=1954)
        0.28 = coord(7/25)
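
The score trees above all follow the same arithmetic, which can be retraced by hand. The sketch below is an illustration, not the catalog's own code: the function names are hypothetical, the tf, queryWeight, fieldWeight and coord relations are read directly off the explanation lines, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption that happens to reproduce the printed idf values. It recomputes the "genre" term score for doc 6010 and the final score of entry 1.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf formula; it matches the values printed in the listing,
    # e.g. idf(docFreq=154, maxDocs=44218) is about 6.653462.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # tf is the square root of the raw term frequency (tf(freq=4.0) = 2.0).
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * term_idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Inputs copied from the "genre in 6010" subtree of entry 1.
genre_6010 = term_score(
    freq=4.0,
    doc_freq=154,
    max_docs=44218,
    boost=3.8039308,
    query_norm=0.017733112,
    field_norm=0.078125,
)

# The twelve per-term scores listed for doc 6010, in listing order.
term_scores_6010 = [
    0.027583236, 0.027952168, 0.020845773, 0.07265686,
    0.075935096, 0.07855095, 0.053448103, 0.046987783,
    0.110418506, 0.46658736, 0.11111565, 0.33953074,
]

# coord(12/25): 12 of the 25 query terms matched this document.
coord = 12 / 25
final_6010 = sum(term_scores_6010) * coord

print(f"genre term score: {genre_6010:.4f}")
print(f"final score:      {final_6010:.4f}")
```

The listing's values are single-precision floats, so a recomputation in double precision agrees only to about five or six significant digits. Terms printed without a boost line, such as "sentences" in doc 2557, behave as if boost were 1.0.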