Document (#32955)

Author
Shen, D.
Yang, Q.
Chen, Z.
Title
Noise reduction through summarization for Web-page classification
Source
Information processing and management. 43(2007) no.6, pp. 1735-1747
Year
2007
Abstract
Because Web pages embed a large variety of noisy information, Web-page classification is much more difficult than pure-text classification. In this paper, we propose to improve Web-page classification performance by removing the noise through summarization techniques. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then put forward a new Web-page summarization algorithm based on Web-page layout and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that the classification algorithms (NB or SVM) augmented by any summarization approach can achieve an improvement of more than 5.0% compared to pure-text-based classification algorithms. We further introduce an ensemble method to combine the different summarization algorithms. The ensemble summarization method achieves an improvement of more than 12.0% over pure-text-based methods.
Theme
Automatic abstracting

Similar documents (author)

  1. Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 3.18
    3.1778126 = sum of:
      3.1778126 = sum of:
        0.59864455 = weight(author_txt:chen in 133) [ClassicSimilarity], result of:
          0.59864455 = score(doc=133,freq=1.0), product of:
            0.38811097 = queryWeight, product of:
              6.169829 = idf(docFreq=242, maxDocs=42740)
              0.062904656 = queryNorm
            1.5424572 = fieldWeight in 133, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.169829 = idf(docFreq=242, maxDocs=42740)
              0.25 = fieldNorm(doc=133)
        0.9738056 = weight(author_txt:yang in 133) [ClassicSimilarity], result of:
          0.9738056 = score(doc=133,freq=1.0), product of:
            0.5368151 = queryWeight, product of:
              1.1760733 = boost
              7.256171 = idf(docFreq=81, maxDocs=42740)
              0.062904656 = queryNorm
            1.8140428 = fieldWeight in 133, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.256171 = idf(docFreq=81, maxDocs=42740)
              0.25 = fieldNorm(doc=133)
        1.6053623 = weight(author_txt:shen in 133) [ClassicSimilarity], result of:
          1.6053623 = score(doc=133,freq=1.0), product of:
            0.7491324 = queryWeight, product of:
              1.3893169 = boost
              8.571848 = idf(docFreq=21, maxDocs=42740)
              0.062904656 = queryNorm
            2.142962 = fieldWeight in 133, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.571848 = idf(docFreq=21, maxDocs=42740)
              0.25 = fieldNorm(doc=133)
    
  2. Chen, Y.-H.; Germain, C.A.; Yang, H.: An exploration into the practices of library Web usability in ARL academic libraries (2009) 1.57
    1.57245 = sum of:
      1.57245 = product of:
        2.358675 = sum of:
          0.89796686 = weight(author_txt:chen in 4799) [ClassicSimilarity], result of:
            0.89796686 = score(doc=4799,freq=1.0), product of:
              0.38811097 = queryWeight, product of:
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.062904656 = queryNorm
              2.313686 = fieldWeight in 4799, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.375 = fieldNorm(doc=4799)
          1.4607083 = weight(author_txt:yang in 4799) [ClassicSimilarity], result of:
            1.4607083 = score(doc=4799,freq=1.0), product of:
              0.5368151 = queryWeight, product of:
                1.1760733 = boost
                7.256171 = idf(docFreq=81, maxDocs=42740)
                0.062904656 = queryNorm
              2.721064 = fieldWeight in 4799, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.256171 = idf(docFreq=81, maxDocs=42740)
                0.375 = fieldNorm(doc=4799)
        0.6666667 = coord(2/3)
    
  3. Liu, D.-R.; Chen, Y.-H.; Shen, M.; Lu, P.-J.: Complementary QA network analysis for QA retrieval in social question-answering websites (2015) 1.47
    1.4693379 = sum of:
      1.4693379 = product of:
        2.204007 = sum of:
          0.59864455 = weight(author_txt:chen in 3612) [ClassicSimilarity], result of:
            0.59864455 = score(doc=3612,freq=1.0), product of:
              0.38811097 = queryWeight, product of:
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.062904656 = queryNorm
              1.5424572 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.25 = fieldNorm(doc=3612)
          1.6053623 = weight(author_txt:shen in 3612) [ClassicSimilarity], result of:
            1.6053623 = score(doc=3612,freq=1.0), product of:
              0.7491324 = queryWeight, product of:
                1.3893169 = boost
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.062904656 = queryNorm
              2.142962 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.25 = fieldNorm(doc=3612)
        0.6666667 = coord(2/3)
    
  4. Shen, X.-L.; Li, Y.-J.; Sun, Y.; Chen, J.; Wang, F.: Knowledge withholding in online knowledge spaces : social deviance behavior and secondary control perspective (2019) 1.47
    1.4693379 = sum of:
      1.4693379 = product of:
        2.204007 = sum of:
          0.59864455 = weight(author_txt:chen in 1017) [ClassicSimilarity], result of:
            0.59864455 = score(doc=1017,freq=1.0), product of:
              0.38811097 = queryWeight, product of:
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.062904656 = queryNorm
              1.5424572 = fieldWeight in 1017, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.169829 = idf(docFreq=242, maxDocs=42740)
                0.25 = fieldNorm(doc=1017)
          1.6053623 = weight(author_txt:shen in 1017) [ClassicSimilarity], result of:
            1.6053623 = score(doc=1017,freq=1.0), product of:
              0.7491324 = queryWeight, product of:
                1.3893169 = boost
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.062904656 = queryNorm
              2.142962 = fieldWeight in 1017, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.25 = fieldNorm(doc=1017)
        0.6666667 = coord(2/3)
    
  5. Shen, Z.: CJK: the unique need of Chinese, Japanese, and Korean language cataloging (1993) 1.34
    1.3378018 = sum of:
      1.3378018 = product of:
        4.0134053 = sum of:
          4.0134053 = weight(author_txt:shen in 4727) [ClassicSimilarity], result of:
            4.0134053 = score(doc=4727,freq=1.0), product of:
              0.7491324 = queryWeight, product of:
                1.3893169 = boost
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.062904656 = queryNorm
              5.3574047 = fieldWeight in 4727, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.571848 = idf(docFreq=21, maxDocs=42740)
                0.625 = fieldNorm(doc=4727)
        0.33333334 = coord(1/3)
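
Note on reading the score trees above: each leaf appears to follow the TF-IDF product of Lucene's ClassicSimilarity, score = queryWeight x fieldWeight, with queryWeight = boost x idf x queryNorm and fieldWeight = tf x idf x fieldNorm, where tf = sqrt(termFreq). The Python sketch below recomputes one leaf (author_txt:shen in document 133). The function names are illustrative, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption inferred from the printed values, not taken from the catalog's configuration.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed idf variant: it reproduces the idf values printed in the trees
    # above (e.g. 8.571848 for docFreq=21, maxDocs=42740).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # One leaf of the explanation tree: queryWeight * fieldWeight.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values copied from the "author_txt:shen in 133" leaf.
shen_133 = term_score(
    freq=1.0,
    doc_freq=21,
    max_docs=42740,
    field_norm=0.25,
    query_norm=0.062904656,
    boost=1.3893169,
)
# Lucene works in 32-bit floats, so expect agreement with the listed
# 1.6053623 only to a few decimal places.
print(round(shen_133, 4))
```

Leaves without a boost line, such as author_txt:chen, correspond to boost=1.0 in this sketch.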
    

Similar documents (content)

  1. Reeve, L.H.; Han, H.; Brooks, A.D.: The use of domain-specific concepts in biomedical text summarization (2007) 0.22
    0.21894386 = sum of:
      0.21894386 = product of:
        0.78194237 = sum of:
          0.051226165 = weight(abstract_txt:summaries in 2956) [ClassicSimilarity], result of:
            0.051226165 = score(doc=2956,freq=2.0), product of:
              0.08257575 = queryWeight, product of:
                1.0316653 = boost
                7.0185 = idf(docFreq=103, maxDocs=42740)
                0.01140432 = queryNorm
              0.6203536 = fieldWeight in 2956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0185 = idf(docFreq=103, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.055501085 = weight(abstract_txt:reduction in 2956) [ClassicSimilarity], result of:
            0.055501085 = score(doc=2956,freq=2.0), product of:
              0.08710818 = queryWeight, product of:
                1.0596002 = boost
                7.2085433 = idf(docFreq=85, maxDocs=42740)
                0.01140432 = queryNorm
              0.63715124 = fieldWeight in 2956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2085433 = idf(docFreq=85, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.04745032 = weight(abstract_txt:method in 2956) [ClassicSimilarity], result of:
            0.04745032 = score(doc=2956,freq=6.0), product of:
              0.068546765 = queryWeight, product of:
                1.329294 = boost
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.01140432 = queryNorm
              0.6922328 = fieldWeight in 2956, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.020969387 = weight(abstract_txt:performance in 2956) [ClassicSimilarity], result of:
            0.020969387 = score(doc=2956,freq=1.0), product of:
              0.072266184 = queryWeight, product of:
                1.3648821 = boost
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.01140432 = queryNorm
              0.29016873 = fieldWeight in 2956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.010385168 = weight(abstract_txt:based in 2956) [ClassicSimilarity], result of:
            0.010385168 = score(doc=2956,freq=1.0), product of:
              0.051782627 = queryWeight, product of:
                1.4150287 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.01140432 = queryNorm
              0.20055313 = fieldWeight in 2956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.06225483 = weight(abstract_txt:text in 2956) [ClassicSimilarity], result of:
            0.06225483 = score(doc=2956,freq=5.0), product of:
              0.10998835 = queryWeight, product of:
                2.38131 = boost
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.01140432 = queryNorm
              0.566013 = fieldWeight in 2956, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
          0.5341554 = weight(abstract_txt:summarization in 2956) [ClassicSimilarity], result of:
            0.5341554 = score(doc=2956,freq=4.0), product of:
              0.5984011 = queryWeight, product of:
                7.347808 = boost
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.01140432 = queryNorm
              0.8926377 = fieldWeight in 2956, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.0625 = fieldNorm(doc=2956)
        0.28 = coord(7/25)
    
  2. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.22
    0.2181573 = sum of:
      0.2181573 = product of:
        0.9089888 = sum of:
          0.020969387 = weight(abstract_txt:performance in 4694) [ClassicSimilarity], result of:
            0.020969387 = score(doc=4694,freq=1.0), product of:
              0.072266184 = queryWeight, product of:
                1.3648821 = boost
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.01140432 = queryNorm
              0.29016873 = fieldWeight in 4694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
          0.017987639 = weight(abstract_txt:based in 4694) [ClassicSimilarity], result of:
            0.017987639 = score(doc=4694,freq=3.0), product of:
              0.051782627 = queryWeight, product of:
                1.4150287 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.01140432 = queryNorm
              0.3473682 = fieldWeight in 4694, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
          0.026209204 = weight(abstract_txt:improve in 4694) [ClassicSimilarity], result of:
            0.026209204 = score(doc=4694,freq=1.0), product of:
              0.08385208 = queryWeight, product of:
                1.4702274 = boost
                5.0010357 = idf(docFreq=781, maxDocs=42740)
                0.01140432 = queryNorm
              0.31256473 = fieldWeight in 4694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0010357 = idf(docFreq=781, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
          0.048222385 = weight(abstract_txt:text in 4694) [ClassicSimilarity], result of:
            0.048222385 = score(doc=4694,freq=3.0), product of:
              0.10998835 = queryWeight, product of:
                2.38131 = boost
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.01140432 = queryNorm
              0.43843177 = fieldWeight in 4694, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
          0.14139605 = weight(abstract_txt:algorithms in 4694) [ClassicSimilarity], result of:
            0.14139605 = score(doc=4694,freq=2.0), product of:
              0.2778473 = queryWeight, product of:
                4.2315617 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.01140432 = queryNorm
              0.50889844 = fieldWeight in 4694, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
          0.65420413 = weight(abstract_txt:summarization in 4694) [ClassicSimilarity], result of:
            0.65420413 = score(doc=4694,freq=6.0), product of:
              0.5984011 = queryWeight, product of:
                7.347808 = boost
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.01140432 = queryNorm
              1.0932535 = fieldWeight in 4694, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.0625 = fieldNorm(doc=4694)
        0.24 = coord(6/25)
    
  3. Agarwal, B.; Ramampiaro, H.; Langseth, H.; Ruocco, M.: A deep network model for paraphrase detection in short text messages (2018) 0.21
    0.20934999 = sum of:
      0.20934999 = product of:
        0.5815277 = sum of:
          0.04455076 = weight(abstract_txt:achieves in 1044) [ClassicSimilarity], result of:
            0.04455076 = score(doc=1044,freq=1.0), product of:
              0.094791934 = queryWeight, product of:
                1.1053461 = boost
                7.519756 = idf(docFreq=62, maxDocs=42740)
                0.01140432 = queryNorm
              0.46998474 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.519756 = idf(docFreq=62, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.059081446 = weight(abstract_txt:noisy in 1044) [ClassicSimilarity], result of:
            0.059081446 = score(doc=1044,freq=1.0), product of:
              0.11442003 = queryWeight, product of:
                1.2144052 = boost
                8.261693 = idf(docFreq=29, maxDocs=42740)
                0.01140432 = queryNorm
              0.5163558 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.261693 = idf(docFreq=29, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.020969387 = weight(abstract_txt:performance in 1044) [ClassicSimilarity], result of:
            0.020969387 = score(doc=1044,freq=1.0), product of:
              0.072266184 = queryWeight, product of:
                1.3648821 = boost
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.01140432 = queryNorm
              0.29016873 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.014686844 = weight(abstract_txt:based in 1044) [ClassicSimilarity], result of:
            0.014686844 = score(doc=1044,freq=2.0), product of:
              0.051782627 = queryWeight, product of:
                1.4150287 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.01140432 = queryNorm
              0.28362495 = fieldWeight in 1044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.017869772 = weight(abstract_txt:more in 1044) [ClassicSimilarity], result of:
            0.017869772 = score(doc=1044,freq=2.0), product of:
              0.05901708 = queryWeight, product of:
                1.5106437 = boost
                3.4256759 = idf(docFreq=3778, maxDocs=42740)
                0.01140432 = queryNorm
              0.30278984 = fieldWeight in 1044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4256759 = idf(docFreq=3778, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.018796653 = weight(abstract_txt:than in 1044) [ClassicSimilarity], result of:
            0.018796653 = score(doc=1044,freq=1.0), product of:
              0.07690632 = queryWeight, product of:
                1.7244644 = boost
                3.9105554 = idf(docFreq=2326, maxDocs=42740)
                0.01140432 = queryNorm
              0.24440971 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9105554 = idf(docFreq=2326, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.09912173 = weight(abstract_txt:noise in 1044) [ClassicSimilarity], result of:
            0.09912173 = score(doc=1044,freq=1.0), product of:
              0.20354348 = queryWeight, product of:
                2.2906365 = boost
                7.7916894 = idf(docFreq=47, maxDocs=42740)
                0.01140432 = queryNorm
              0.4869806 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7916894 = idf(docFreq=47, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.039373413 = weight(abstract_txt:text in 1044) [ClassicSimilarity], result of:
            0.039373413 = score(doc=1044,freq=2.0), product of:
              0.10998835 = queryWeight, product of:
                2.38131 = boost
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.01140432 = queryNorm
              0.35797805 = fieldWeight in 1044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
          0.2670777 = weight(abstract_txt:summarization in 1044) [ClassicSimilarity], result of:
            0.2670777 = score(doc=1044,freq=1.0), product of:
              0.5984011 = queryWeight, product of:
                7.347808 = boost
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.01140432 = queryNorm
              0.44631886 = fieldWeight in 1044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.0625 = fieldNorm(doc=1044)
        0.36 = coord(9/25)
    
  4. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.21
    0.20555262 = sum of:
      0.20555262 = product of:
        1.0277631 = sum of:
          0.07842373 = weight(abstract_txt:summaries in 2564) [ClassicSimilarity], result of:
            0.07842373 = score(doc=2564,freq=3.0), product of:
              0.08257575 = queryWeight, product of:
                1.0316653 = boost
                7.0185 = idf(docFreq=103, maxDocs=42740)
                0.01140432 = queryNorm
              0.9497186 = fieldWeight in 2564, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.0185 = idf(docFreq=103, maxDocs=42740)
                0.078125 = fieldNorm(doc=2564)
          0.034801513 = weight(abstract_txt:text in 2564) [ClassicSimilarity], result of:
            0.034801513 = score(doc=2564,freq=1.0), product of:
              0.10998835 = queryWeight, product of:
                2.38131 = boost
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.01140432 = queryNorm
              0.3164109 = fieldWeight in 2564, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0500593 = idf(docFreq=2023, maxDocs=42740)
                0.078125 = fieldNorm(doc=2564)
          0.058670066 = weight(abstract_txt:classification in 2564) [ClassicSimilarity], result of:
            0.058670066 = score(doc=2564,freq=1.0), product of:
              0.18774644 = queryWeight, product of:
                4.1157355 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.01140432 = queryNorm
              0.3124963 = fieldWeight in 2564, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.078125 = fieldNorm(doc=2564)
          0.27762756 = weight(abstract_txt:page in 2564) [ClassicSimilarity], result of:
            0.27762756 = score(doc=2564,freq=2.0), product of:
              0.42000937 = queryWeight, product of:
                6.1558933 = boost
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.01140432 = queryNorm
              0.66100323 = fieldWeight in 2564, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.078125 = fieldNorm(doc=2564)
          0.5782402 = weight(abstract_txt:summarization in 2564) [ClassicSimilarity], result of:
            0.5782402 = score(doc=2564,freq=3.0), product of:
              0.5984011 = queryWeight, product of:
                7.347808 = boost
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.01140432 = queryNorm
              0.96630865 = fieldWeight in 2564, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.141102 = idf(docFreq=91, maxDocs=42740)
                0.078125 = fieldNorm(doc=2564)
        0.2 = coord(5/25)
    
  5. Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.18
    0.1770865 = sum of:
      0.1770865 = product of:
        0.6324518 = sum of:
          0.03298833 = weight(abstract_txt:directory in 3071) [ClassicSimilarity], result of:
            0.03298833 = score(doc=3071,freq=1.0), product of:
              0.077584475 = queryWeight, product of:
                6.803078 = idf(docFreq=128, maxDocs=42740)
                0.01140432 = queryNorm
              0.4251924 = fieldWeight in 3071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.803078 = idf(docFreq=128, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.03355244 = weight(abstract_txt:method in 3071) [ClassicSimilarity], result of:
            0.03355244 = score(doc=3071,freq=3.0), product of:
              0.068546765 = queryWeight, product of:
                1.329294 = boost
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.01140432 = queryNorm
              0.4894825 = fieldWeight in 3071, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.02965519 = weight(abstract_txt:performance in 3071) [ClassicSimilarity], result of:
            0.02965519 = score(doc=3071,freq=2.0), product of:
              0.072266184 = queryWeight, product of:
                1.3648821 = boost
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.01140432 = queryNorm
              0.41036054 = fieldWeight in 3071, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6426997 = idf(docFreq=1118, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.010385168 = weight(abstract_txt:based in 3071) [ClassicSimilarity], result of:
            0.010385168 = score(doc=3071,freq=1.0), product of:
              0.051782627 = queryWeight, product of:
                1.4150287 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.01140432 = queryNorm
              0.20055313 = fieldWeight in 3071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.026209204 = weight(abstract_txt:improve in 3071) [ClassicSimilarity], result of:
            0.026209204 = score(doc=3071,freq=1.0), product of:
              0.08385208 = queryWeight, product of:
                1.4702274 = boost
                5.0010357 = idf(docFreq=781, maxDocs=42740)
                0.01140432 = queryNorm
              0.31256473 = fieldWeight in 3071, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0010357 = idf(docFreq=781, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.114969395 = weight(abstract_txt:classification in 3071) [ClassicSimilarity], result of:
            0.114969395 = score(doc=3071,freq=6.0), product of:
              0.18774644 = queryWeight, product of:
                4.1157355 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.01140432 = queryNorm
              0.61236525 = fieldWeight in 3071, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
          0.384692 = weight(abstract_txt:page in 3071) [ClassicSimilarity], result of:
            0.384692 = score(doc=3071,freq=6.0), product of:
              0.42000937 = queryWeight, product of:
                6.1558933 = boost
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.01140432 = queryNorm
              0.9159129 = fieldWeight in 3071, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.0625 = fieldNorm(doc=3071)
        0.28 = coord(7/25)
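
Note on the document totals above: each entry's final score appears to be the sum of its matching term scores multiplied by a coordination factor, coord(matched/total), which rewards documents that match more of the query terms. The Python sketch below recomputes the total for entry 5 (Kwon and Lee, document 3071) from the seven leaf scores printed in its tree. The function name document_score is hypothetical.

```python
def document_score(term_scores, query_term_count):
    # coord = matched query terms / all query terms, e.g. 7/25 = 0.28.
    coord = len(term_scores) / query_term_count
    return coord * sum(term_scores)


# Leaf scores copied from the tree for document 3071, in listed order:
# directory, method, performance, based, improve, classification, page.
kwon_lee_terms = [
    0.03298833,
    0.03355244,
    0.02965519,
    0.010385168,
    0.026209204,
    0.114969395,
    0.384692,
]

total = document_score(kwon_lee_terms, query_term_count=25)
# Should be close to the listed 0.1770865, up to float rounding.
print(round(total, 4))
```

The same calculation applies to the author-based list, where the query has three terms: a document matching only one of them is scaled by coord(1/3).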