Document (#36120)

Moura, E.S. de
Fernandes, D.
Ribeiro-Neto, B.
Silva, A.S. da
Gonçalves, M.A.
Using structural information to improve search in Web collections
Journal of the American Society for Information Science and Technology. 61(2010) no.12, p.2503-2513
In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our experiments suggest that our block-weighting ranking method is superior to all baselines across all collections we used and that it yields average gains in precision of 5 to 20%.
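
To make the idea of block-level term weighting concrete, here is a minimal Python sketch. It is not one of the nine block-weight functions from the paper, which the abstract does not spell out. It assumes the textbook BM25 formula and a hypothetical scheme in which each block's term count is scaled by a per-block weight before it enters BM25. The block names, the weights, and the parameters `k1` and `b` are illustrative assumptions.

```python
import math


def bm25_idf(doc_freq, num_docs):
    """A common non-negative BM25 IDF variant (assumed here, not taken from the paper)."""
    return math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def block_weighted_tf(block_freqs, block_weights):
    """Hypothetical block weighting: scale each block's term count by that block's weight."""
    return sum(block_weights[block] * freq for block, freq in block_freqs.items())


def bm25_term(tf, doc_freq, num_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term, given a (possibly weighted) tf."""
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return bm25_idf(doc_freq, num_docs) * tf * (k1 + 1.0) / (tf + length_norm)


# Illustrative page: the term occurs twice in the main content and three times in a menu.
block_freqs = {"main": 2, "menu": 3}
block_weights = {"main": 1.0, "menu": 0.2}  # assumed weights, for illustration only

plain_tf = sum(block_freqs.values())
weighted_tf = block_weighted_tf(block_freqs, block_weights)

plain_score = bm25_term(plain_tf, doc_freq=50, num_docs=10000, doc_len=300, avg_doc_len=250)
weighted_score = bm25_term(weighted_tf, doc_freq=50, num_docs=10000, doc_len=300, avg_doc_len=250)
```

With these assumed weights, the menu occurrences count for much less than the main-content occurrences, so the block-weighted score is lower than the whole-page score for the same page.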

Similar documents (author)

  1. Calado, P.; Cristo, M.; Gonçalves, M.A.; Moura, E.S. de; Ribeiro-Neto, B.; Ziviani, N.: Link-based similarity measures for the classification of Web documents (2006) 2.43
    2.4273498 = sum of:
      2.4273498 = product of:
        3.6410246 = sum of:
          0.8119388 = weight(author_txt:gonçalves in 4921) [ClassicSimilarity], result of:
            0.8119388 = score(doc=4921,freq=1.0), product of:
              0.37934893 = queryWeight, product of:
                1.1404192 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.038853478 = queryNorm
              2.1403482 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.25 = fieldNorm(doc=4921)
          0.8523554 = weight(author_txt:moura in 4921) [ClassicSimilarity], result of:
            0.8523554 = score(doc=4921,freq=1.0), product of:
              0.39183554 = queryWeight, product of:
                1.1590363 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.038853478 = queryNorm
              2.1752887 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.25 = fieldNorm(doc=4921)
          0.8675183 = weight(author_txt:ribeiro in 4921) [ClassicSimilarity], result of:
            0.8675183 = score(doc=4921,freq=1.0), product of:
              0.39646888 = queryWeight, product of:
                1.1658688 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.038853478 = queryNorm
              2.188112 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.25 = fieldNorm(doc=4921)
          1.1092119 = weight(author_txt:neto in 4921) [ClassicSimilarity], result of:
            1.1092119 = score(doc=4921,freq=1.0), product of:
              0.4670532 = queryWeight, product of:
                1.2654014 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.038853478 = queryNorm
              2.3749156 = fieldWeight in 4921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.25 = fieldNorm(doc=4921)
        0.6666667 = coord(4/6)
  2. Couto, T.; Cristo, M.; Gonçalves, M.A.; Calado, P.; Ziviani, N.; Moura, E.; Ribeiro-Neto, B.: A comparative study of citations and links in document classification (2006) 2.43
    2.4273498 = sum of:
      2.4273498 = product of:
        3.6410246 = sum of:
          0.8119388 = weight(author_txt:gonçalves in 2531) [ClassicSimilarity], result of:
            0.8119388 = score(doc=2531,freq=1.0), product of:
              0.37934893 = queryWeight, product of:
                1.1404192 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.038853478 = queryNorm
              2.1403482 = fieldWeight in 2531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.25 = fieldNorm(doc=2531)
          0.8523554 = weight(author_txt:moura in 2531) [ClassicSimilarity], result of:
            0.8523554 = score(doc=2531,freq=1.0), product of:
              0.39183554 = queryWeight, product of:
                1.1590363 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.038853478 = queryNorm
              2.1752887 = fieldWeight in 2531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.25 = fieldNorm(doc=2531)
          0.8675183 = weight(author_txt:ribeiro in 2531) [ClassicSimilarity], result of:
            0.8675183 = score(doc=2531,freq=1.0), product of:
              0.39646888 = queryWeight, product of:
                1.1658688 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.038853478 = queryNorm
              2.188112 = fieldWeight in 2531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.25 = fieldNorm(doc=2531)
          1.1092119 = weight(author_txt:neto in 2531) [ClassicSimilarity], result of:
            1.1092119 = score(doc=2531,freq=1.0), product of:
              0.4670532 = queryWeight, product of:
                1.2654014 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.038853478 = queryNorm
              2.3749156 = fieldWeight in 2531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.25 = fieldNorm(doc=2531)
        0.6666667 = coord(4/6)
  3. Pereira, D.A.; Ribeiro-Neto, B.; Ziviani, N.; Laender, A.H.F.; Gonçalves, M.A.: A generic Web-based entity resolution framework (2011) 1.39
    1.3943346 = sum of:
      1.3943346 = product of:
        2.788669 = sum of:
          0.8119388 = weight(author_txt:gonçalves in 4450) [ClassicSimilarity], result of:
            0.8119388 = score(doc=4450,freq=1.0), product of:
              0.37934893 = queryWeight, product of:
                1.1404192 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.038853478 = queryNorm
              2.1403482 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
          0.8675183 = weight(author_txt:ribeiro in 4450) [ClassicSimilarity], result of:
            0.8675183 = score(doc=4450,freq=1.0), product of:
              0.39646888 = queryWeight, product of:
                1.1658688 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.038853478 = queryNorm
              2.188112 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
          1.1092119 = weight(author_txt:neto in 4450) [ClassicSimilarity], result of:
            1.1092119 = score(doc=4450,freq=1.0), product of:
              0.4670532 = queryWeight, product of:
                1.2654014 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.038853478 = queryNorm
              2.3749156 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
        0.5 = coord(3/6)
  4. Costa Carvalho, A. da; Rossi, C.; Moura, E.S. de; Silva, A.S. da; Fernandes, D.: LePrEF: Learn to precompute evidence fusion for efficient query evaluation (2012) 1.30
    1.2996906 = sum of:
      1.2996906 = product of:
        2.5993812 = sum of:
          0.54743135 = weight(author_txt:silva in 278) [ClassicSimilarity], result of:
            0.54743135 = score(doc=278,freq=1.0), product of:
              0.2916821 = queryWeight, product of:
                7.5072327 = idf(docFreq=65, maxDocs=44218)
                0.038853478 = queryNorm
              1.8768082 = fieldWeight in 278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5072327 = idf(docFreq=65, maxDocs=44218)
                0.25 = fieldNorm(doc=278)
          0.8523554 = weight(author_txt:moura in 278) [ClassicSimilarity], result of:
            0.8523554 = score(doc=278,freq=1.0), product of:
              0.39183554 = queryWeight, product of:
                1.1590363 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.038853478 = queryNorm
              2.1752887 = fieldWeight in 278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.25 = fieldNorm(doc=278)
          1.1995945 = weight(author_txt:fernandes in 278) [ClassicSimilarity], result of:
            1.1995945 = score(doc=278,freq=1.0), product of:
              0.492092 = queryWeight, product of:
                1.2988777 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.038853478 = queryNorm
              2.4377444 = fieldWeight in 278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.25 = fieldNorm(doc=278)
        0.5 = coord(3/6)
  5. Silveira, M.; Ribeiro-Neto, B.: Concept-based ranking : a case study in the juridical domain (2004) 1.15
    1.1530926 = sum of:
      1.1530926 = product of:
        3.4592779 = sum of:
          1.518157 = weight(author_txt:ribeiro in 2339) [ClassicSimilarity], result of:
            1.518157 = score(doc=2339,freq=1.0), product of:
              0.39646888 = queryWeight, product of:
                1.1658688 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.038853478 = queryNorm
              3.829196 = fieldWeight in 2339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.4375 = fieldNorm(doc=2339)
          1.9411209 = weight(author_txt:neto in 2339) [ClassicSimilarity], result of:
            1.9411209 = score(doc=2339,freq=1.0), product of:
              0.4670532 = queryWeight, product of:
                1.2654014 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.038853478 = queryNorm
              4.156102 = fieldWeight in 2339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.4375 = fieldNorm(doc=2339)
        0.33333334 = coord(2/6)
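
The score trees above can be recomputed by hand. The sketch below assumes the usual Lucene ClassicSimilarity definitions, which agree with the listed numbers: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord (matched terms / query terms). The function names are illustrative, not part of any library.

```python
import math


def idf(doc_freq, max_docs):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, num_query_terms):
    """Sum of term scores, scaled by coord = matched terms / query terms."""
    coord = len(term_scores) / num_query_terms
    return sum(term_scores) * coord


# Values copied from the first author-similarity result (doc=4921):
# (boost, docFreq) per matching term, all with freq=1.0 and fieldNorm=0.25.
MAX_DOCS = 44218
QUERY_NORM = 0.038853478
matched = [
    (1.1404192, 22),  # gonçalves
    (1.1590363, 19),  # moura
    (1.1658688, 18),  # ribeiro
    (1.2654014, 8),   # neto
]
scores = [
    term_score(1.0, doc_freq, MAX_DOCS, QUERY_NORM, 0.25, boost)
    for boost, doc_freq in matched
]
total = document_score(scores, num_query_terms=6)
```

Here `total` comes out close to the 2.4273498 listed for the first result. Small differences in the last digits are expected because Lucene computes in 32-bit floats.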

Similar documents (content)

  1. Fersini, E.; Messina, E.; Archetti, F.: Enhancing web page classification through image-block importance analysis (2008) 0.41
    0.4069486 = sum of:
      0.4069486 = product of:
        1.1304127 = sum of:
          0.047689747 = weight(abstract_txt:modified in 2102) [ClassicSimilarity], result of:
            0.047689747 = score(doc=2102,freq=1.0), product of:
              0.089078575 = queryWeight, product of:
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0129990475 = queryNorm
              0.5353672 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.050362263 = weight(abstract_txt:weighting in 2102) [ClassicSimilarity], result of:
            0.050362263 = score(doc=2102,freq=1.0), product of:
              0.092376195 = queryWeight, product of:
                1.0183414 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0129990475 = queryNorm
              0.5451866 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.008364356 = weight(abstract_txt:that in 2102) [ClassicSimilarity], result of:
            0.008364356 = score(doc=2102,freq=2.0), product of:
              0.03195033 = queryWeight, product of:
                1.0373174 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0129990475 = queryNorm
              0.26179248 = fieldWeight in 2102, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.06738191 = weight(abstract_txt:inverse in 2102) [ClassicSimilarity], result of:
            0.06738191 = score(doc=2102,freq=1.0), product of:
              0.112163655 = queryWeight, product of:
                1.1221204 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0129990475 = queryNorm
              0.6007464 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.06560704 = weight(abstract_txt:term in 2102) [ClassicSimilarity], result of:
            0.06560704 = score(doc=2102,freq=4.0), product of:
              0.08745411 = queryWeight, product of:
                1.4012592 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0129990475 = queryNorm
              0.75018823 = fieldWeight in 2102, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.018465588 = weight(abstract_txt:using in 2102) [ClassicSimilarity], result of:
            0.018465588 = score(doc=2102,freq=1.0), product of:
              0.068250485 = queryWeight, product of:
                1.5160966 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0129990475 = queryNorm
              0.27055615 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.11975692 = weight(abstract_txt:weight in 2102) [ClassicSimilarity], result of:
            0.11975692 = score(doc=2102,freq=1.0), product of:
              0.20734823 = queryWeight, product of:
                2.1576378 = boost
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.0129990475 = queryNorm
              0.57756424 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.23509274 = weight(abstract_txt:blocks in 2102) [ClassicSimilarity], result of:
            0.23509274 = score(doc=2102,freq=3.0), product of:
              0.22539917 = queryWeight, product of:
                2.2495959 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0129990475 = queryNorm
              1.0430063 = fieldWeight in 2102, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
          0.5176921 = weight(abstract_txt:block in 2102) [ClassicSimilarity], result of:
            0.5176921 = score(doc=2102,freq=3.0), product of:
              0.48067388 = queryWeight, product of:
                4.645887 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0129990475 = queryNorm
              1.0770131 = fieldWeight in 2102, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.078125 = fieldNorm(doc=2102)
        0.36 = coord(9/25)
  2. Wan, X.; Yang, J.; Xiao, J.: Towards a unified approach to document similarity search using manifold-ranking of blocks (2008) 0.36
    0.3606414 = sum of:
      0.3606414 = product of:
        1.1270044 = sum of:
          0.004731594 = weight(abstract_txt:that in 2081) [ClassicSimilarity], result of:
            0.004731594 = score(doc=2081,freq=1.0), product of:
              0.03195033 = queryWeight, product of:
                1.0373174 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0129990475 = queryNorm
              0.1480922 = fieldWeight in 2081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.053157825 = weight(abstract_txt:compute in 2081) [ClassicSimilarity], result of:
            0.053157825 = score(doc=2081,freq=1.0), product of:
              0.111124046 = queryWeight, product of:
                1.116908 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0129990475 = queryNorm
              0.47836474 = fieldWeight in 2081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.028932363 = weight(abstract_txt:improve in 2081) [ClassicSimilarity], result of:
            0.028932363 = score(doc=2081,freq=1.0), product of:
              0.0933317 = queryWeight, product of:
                1.4475813 = boost
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.0129990475 = queryNorm
              0.30999503 = fieldWeight in 2081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.01477247 = weight(abstract_txt:using in 2081) [ClassicSimilarity], result of:
            0.01477247 = score(doc=2081,freq=1.0), product of:
              0.068250485 = queryWeight, product of:
                1.5160966 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0129990475 = queryNorm
              0.21644491 = fieldWeight in 2081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.24280274 = weight(abstract_txt:blocks in 2081) [ClassicSimilarity], result of:
            0.24280274 = score(doc=2081,freq=5.0), product of:
              0.22539917 = queryWeight, product of:
                2.2495959 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0129990475 = queryNorm
              1.0772122 = fieldWeight in 2081, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.062648706 = weight(abstract_txt:pages in 2081) [ClassicSimilarity], result of:
            0.062648706 = score(doc=2081,freq=1.0), product of:
              0.17881821 = queryWeight, product of:
                2.45403 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0129990475 = queryNorm
              0.3503486 = fieldWeight in 2081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.41415367 = weight(abstract_txt:block in 2081) [ClassicSimilarity], result of:
            0.41415367 = score(doc=2081,freq=3.0), product of:
              0.48067388 = queryWeight, product of:
                4.645887 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0129990475 = queryNorm
              0.86161053 = fieldWeight in 2081, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
          0.30580503 = weight(abstract_txt:ranking in 2081) [ClassicSimilarity], result of:
            0.30580503 = score(doc=2081,freq=6.0), product of:
              0.35677385 = queryWeight, product of:
                4.9021378 = boost
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.0129990475 = queryNorm
              0.8571397 = fieldWeight in 2081, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.0625 = fieldNorm(doc=2081)
        0.32 = coord(8/25)
  3. Dang, E.K.F.; Luk, R.W.P.; Allan, J.; Ho, K.S.; Chung, K.F.L.; Lee, D.L.: A new context-dependent term weight computed by boost and discount using relevance information (2010) 0.24
    0.23742363 = sum of:
      0.23742363 = product of:
        0.74194884 = sum of:
          0.040289808 = weight(abstract_txt:weighting in 4120) [ClassicSimilarity], result of:
            0.040289808 = score(doc=4120,freq=1.0), product of:
              0.092376195 = queryWeight, product of:
                1.0183414 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0129990475 = queryNorm
              0.43614927 = fieldWeight in 4120, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.053157825 = weight(abstract_txt:compute in 4120) [ClassicSimilarity], result of:
            0.053157825 = score(doc=4120,freq=1.0), product of:
              0.111124046 = queryWeight, product of:
                1.116908 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0129990475 = queryNorm
              0.47836474 = fieldWeight in 4120, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.05390553 = weight(abstract_txt:inverse in 4120) [ClassicSimilarity], result of:
            0.05390553 = score(doc=4120,freq=1.0), product of:
              0.112163655 = queryWeight, product of:
                1.1221204 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0129990475 = queryNorm
              0.48059714 = fieldWeight in 4120, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.06943197 = weight(abstract_txt:term in 4120) [ClassicSimilarity], result of:
            0.06943197 = score(doc=4120,freq=7.0), product of:
              0.08745411 = queryWeight, product of:
                1.4012592 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0129990475 = queryNorm
              0.79392457 = fieldWeight in 4120, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.025586668 = weight(abstract_txt:using in 4120) [ClassicSimilarity], result of:
            0.025586668 = score(doc=4120,freq=3.0), product of:
              0.068250485 = queryWeight, product of:
                1.5160966 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0129990475 = queryNorm
              0.37489358 = fieldWeight in 4120, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.070546724 = weight(abstract_txt:frequency in 4120) [ClassicSimilarity], result of:
            0.070546724 = score(doc=4120,freq=2.0), product of:
              0.13419855 = queryWeight, product of:
                1.7358104 = boost
                5.947494 = idf(docFreq=313, maxDocs=44218)
                0.0129990475 = queryNorm
              0.5256892 = fieldWeight in 4120, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.947494 = idf(docFreq=313, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.06370764 = weight(abstract_txt:collections in 4120) [ClassicSimilarity], result of:
            0.06370764 = score(doc=4120,freq=3.0), product of:
              0.12537885 = queryWeight, product of:
                2.0548785 = boost
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.0129990475 = queryNorm
              0.50812113 = fieldWeight in 4120, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
          0.36532265 = weight(abstract_txt:bm25 in 4120) [ClassicSimilarity], result of:
            0.36532265 = score(doc=4120,freq=2.0), product of:
              0.45980963 = queryWeight, product of:
                3.9351656 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0129990475 = queryNorm
              0.79450846 = fieldWeight in 4120, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0625 = fieldNorm(doc=4120)
        0.32 = coord(8/25)
  4. Trotman, A.: Choosing document structure weights (2005) 0.23
    0.23296008 = sum of:
      0.23296008 = product of:
        0.970667 = sum of:
          0.07122299 = weight(abstract_txt:weighting in 1016) [ClassicSimilarity], result of:
            0.07122299 = score(doc=1016,freq=2.0), product of:
              0.092376195 = queryWeight, product of:
                1.0183414 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0129990475 = queryNorm
              0.7710102 = fieldWeight in 1016, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
          0.060546692 = weight(abstract_txt:occurrences in 1016) [ClassicSimilarity], result of:
            0.060546692 = score(doc=1016,freq=1.0), product of:
              0.10444401 = queryWeight, product of:
                1.0828172 = boost
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.0129990475 = queryNorm
              0.57970476 = fieldWeight in 1016, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
          0.03280352 = weight(abstract_txt:term in 1016) [ClassicSimilarity], result of:
            0.03280352 = score(doc=1016,freq=1.0), product of:
              0.08745411 = queryWeight, product of:
                1.4012592 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0129990475 = queryNorm
              0.37509412 = fieldWeight in 1016, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
          0.026114283 = weight(abstract_txt:using in 1016) [ClassicSimilarity], result of:
            0.026114283 = score(doc=1016,freq=2.0), product of:
              0.068250485 = queryWeight, product of:
                1.5160966 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0129990475 = queryNorm
              0.38262415 = fieldWeight in 1016, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
          0.5592838 = weight(abstract_txt:bm25 in 1016) [ClassicSimilarity], result of:
            0.5592838 = score(doc=1016,freq=3.0), product of:
              0.45980963 = queryWeight, product of:
                3.9351656 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0129990475 = queryNorm
              1.2163377 = fieldWeight in 1016, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
          0.22069576 = weight(abstract_txt:ranking in 1016) [ClassicSimilarity], result of:
            0.22069576 = score(doc=1016,freq=2.0), product of:
              0.35677385 = queryWeight, product of:
                4.9021378 = boost
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.0129990475 = queryNorm
              0.61858726 = fieldWeight in 1016, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.078125 = fieldNorm(doc=1016)
        0.24 = coord(6/25)
  5. Alzahrani, S.; Palade, V.; Salim, N.; Abraham, A.: Using structural information and citation evidence to detect significant plagiarism cases in scientific publications (2012) 0.22
    0.21681367 = sum of:
      0.21681367 = product of:
        0.6022602 = sum of:
          0.049856097 = weight(abstract_txt:weighting in 4982) [ClassicSimilarity], result of:
            0.049856097 = score(doc=4982,freq=2.0), product of:
              0.092376195 = queryWeight, product of:
                1.0183414 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0129990475 = queryNorm
              0.5397072 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.005855049 = weight(abstract_txt:that in 4982) [ClassicSimilarity], result of:
            0.005855049 = score(doc=4982,freq=2.0), product of:
              0.03195033 = queryWeight, product of:
                1.0373174 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0129990475 = queryNorm
              0.18325473 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.047167342 = weight(abstract_txt:inverse in 4982) [ClassicSimilarity], result of:
            0.047167342 = score(doc=4982,freq=1.0), product of:
              0.112163655 = queryWeight, product of:
                1.1221204 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0129990475 = queryNorm
              0.4205225 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.022962466 = weight(abstract_txt:term in 4982) [ClassicSimilarity], result of:
            0.022962466 = score(doc=4982,freq=1.0), product of:
              0.08745411 = queryWeight, product of:
                1.4012592 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0129990475 = queryNorm
              0.26256588 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.035801973 = weight(abstract_txt:improve in 4982) [ClassicSimilarity], result of:
            0.035801973 = score(doc=4982,freq=2.0), product of:
              0.0933317 = queryWeight, product of:
                1.4475813 = boost
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.0129990475 = queryNorm
              0.38359928 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9599204 = idf(docFreq=842, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.028903218 = weight(abstract_txt:using in 4982) [ClassicSimilarity], result of:
            0.028903218 = score(doc=4982,freq=5.0), product of:
              0.068250485 = queryWeight, product of:
                1.5160966 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0129990475 = queryNorm
              0.42348737 = fieldWeight in 4982, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.043648556 = weight(abstract_txt:frequency in 4982) [ClassicSimilarity], result of:
            0.043648556 = score(doc=4982,freq=1.0), product of:
              0.13419855 = queryWeight, product of:
                1.7358104 = boost
                5.947494 = idf(docFreq=313, maxDocs=44218)
                0.0129990475 = queryNorm
              0.32525358 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.947494 = idf(docFreq=313, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.16765969 = weight(abstract_txt:weight in 4982) [ClassicSimilarity], result of:
            0.16765969 = score(doc=4982,freq=4.0), product of:
              0.20734823 = queryWeight, product of:
                2.1576378 = boost
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.0129990475 = queryNorm
              0.80858994 = fieldWeight in 4982, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.20040579 = weight(abstract_txt:baselines in 4982) [ClassicSimilarity], result of:
            0.20040579 = score(doc=4982,freq=3.0), product of:
              0.25704017 = queryWeight, product of:
                2.402309 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0129990475 = queryNorm
              0.7796672 = fieldWeight in 4982, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
        0.36 = coord(9/25)
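
The arithmetic in the explanation for entry 5 can be re-derived by hand. The sketch below is only an illustration, not the catalogue's own code: it assumes the usual ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes boost, queryNorm, fieldNorm and coord directly from the lines above. The names `TERMS`, `idf`, `term_score` and `total_score` are made up for this example.

```python
import math

# Constants copied from the explanation of entry 5 (doc 4982).
MAX_DOCS = 44218
QUERY_NORM = 0.0129990475
FIELD_NORM = 0.0546875
COORD = 9 / 25  # 9 of the 25 query terms matched

# (term, boost, docFreq, termFreq) as listed in the explanation.
TERMS = [
    ("weighting", 1.0183414, 111, 2),
    ("that", 1.0373174, 11241, 2),
    ("inverse", 1.1221204, 54, 1),
    ("term", 1.4012592, 987, 1),
    ("improve", 1.4475813, 842, 2),
    ("using", 1.5160966, 3765, 5),
    ("frequency", 1.7358104, 313, 1),
    ("weight", 2.1576378, 73, 4),
    ("baselines", 2.402309, 31, 3),
]


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, term_freq):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)
    field_weight = math.sqrt(term_freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


def total_score(terms):
    # Final score = coord * sum of the per-term scores.
    return COORD * sum(term_score(b, df, tf) for _, b, df, tf in terms)


if __name__ == "__main__":
    for name, boost, df, tf in TERMS:
        print(f"{name:10s} {term_score(boost, df, tf):.9f}")
    print(f"total      {total_score(TERMS):.8f}")
```

Because Lucene computes these values in single precision, the figures printed here should agree with the ones listed above only up to small rounding differences in the last digits.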