Document (#32609)

Author
Liu, Y.
Zhang, M.
Cen, R.
Ru, L.
Ma, S.
Title
Data cleansing for Web information retrieval using query independent features
Source
Journal of the American Society for Information Science and Technology. 58(2007) no.12, p.1884-1898
Year
2007
Abstract
Understanding what kinds of Web pages are the most useful for Web search engine users is a critical task in Web information retrieval (IR). Most previous work used hyperlink analysis algorithms to solve this problem. However, little research has focused on query-independent Web data cleansing for Web IR. In this paper, we first analyze the differences between retrieval target pages and ordinary ones based on more than 30 million Web pages obtained from both the Text Retrieval Conference (TREC) and a widely used Chinese search engine, SOGOU (www.sogou.com). We further propose a learning-based data cleansing algorithm for filtering out Web pages that are unlikely to be useful for user requests. We found that there exists a large proportion of low-quality Web pages in both the English and the Chinese Web page corpora, and that retrieval target pages can be identified using query-independent features and cleansing algorithms. The experimental results showed that our algorithm is effective in removing a large portion of Web pages with a small loss in retrieval target pages. It makes it possible for Web IR tools to meet a large fraction of users' needs with only a small fraction of the pages on the Web. These results may help Web search engines make better use of their limited storage and computation resources to improve search performance.
Footnote
Contribution to a special topic section "Mining Web resources for enhancing information retrieval"
Theme
Data Mining
Search engines
Object
WWW

Similar documents (author)

  1. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 4.11
    4.1057124 = sum of:
      4.1057124 = weight(author_txt:zhang in 7711) [ClassicSimilarity], result of:
        4.1057124 = score(doc=7711,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.15222691 = queryNorm
          4.105713 = fieldWeight in 7711, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.625 = fieldNorm(doc=7711)
    
  2. Zhang, A.: Multimedia file formats on the Internet : a beginner's guide for PC users (1995) 4.11
    4.1057124 = sum of:
      4.1057124 = weight(author_txt:zhang in 3281) [ClassicSimilarity], result of:
        4.1057124 = score(doc=3281,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.15222691 = queryNorm
          4.105713 = fieldWeight in 3281, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.625 = fieldNorm(doc=3281)
    
  3. Zhang, J.: A representational analysis of relational information displays (1996) 4.11
    4.1057124 = sum of:
      4.1057124 = weight(author_txt:zhang in 6472) [ClassicSimilarity], result of:
        4.1057124 = score(doc=6472,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.15222691 = queryNorm
          4.105713 = fieldWeight in 6472, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.625 = fieldNorm(doc=6472)
    
  4. Zhang, Y.: The impact of Internet-based electronic resources on formal scholarly communication in the area of library and information science : a citation analysis (1998) 4.11
    4.1057124 = sum of:
      4.1057124 = weight(author_txt:zhang in 3809) [ClassicSimilarity], result of:
        4.1057124 = score(doc=3809,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.15222691 = queryNorm
          4.105713 = fieldWeight in 3809, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.625 = fieldNorm(doc=3809)
    
  5. Zhang, Y.: Using the Internet for survey research : a case study (2000) 4.11
    4.1057124 = sum of:
      4.1057124 = weight(author_txt:zhang in 5295) [ClassicSimilarity], result of:
        4.1057124 = score(doc=5295,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.15222691 = queryNorm
          4.105713 = fieldWeight in 5295, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.5691404 = idf(docFreq=162, maxDocs=42740)
            0.625 = fieldNorm(doc=5295)
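
The score trees above all follow the same pattern: score = queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = tf x idf x fieldNorm. The following is a minimal Python sketch that recomputes one of these author-match scores from the printed inputs. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the numbers shown here; the function names are illustrative and not part of any library, and queryNorm and fieldNorm are simply copied from the listing rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces idf(docFreq=162, maxDocs=42740) = 6.5691404.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    # Assumed tf form; the listing shows tf(freq=1.0) = 1.0.
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm            # "queryWeight" line
    field_weight = tf(freq) * term_idf * field_norm  # "fieldWeight" line
    return query_weight * field_weight


# Inputs copied from the author_txt:zhang explanations above.
author_score = term_score(freq=1.0, doc_freq=162, max_docs=42740,
                          query_norm=0.15222691, field_norm=0.625)
print(f"{author_score:.4f}")  # should be close to the 4.1057124 reported above
```

Because all five author matches share freq, docFreq, and fieldNorm, the sketch yields the same value for each of them, which is why every entry in this list displays 4.11.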
    

Similar documents (content)

  1. Souza, J.; Carvalho, A.; Cristo, M.; Moura, E.; Calado, P.; Chirita, P.-A.; Nejdl, W.: Using site-level connections to estimate link confidence (2012) 0.26
    0.25966126 = sum of:
      0.25966126 = product of:
        0.6491531 = sum of:
          0.013565819 = weight(abstract_txt:most in 2499) [ClassicSimilarity], result of:
            0.013565819 = score(doc=2499,freq=1.0), product of:
              0.05480051 = queryWeight, product of:
                1.0724792 = boost
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.012900731 = queryNorm
              0.24754913 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.02931525 = weight(abstract_txt:features in 2499) [ClassicSimilarity], result of:
            0.02931525 = score(doc=2499,freq=2.0), product of:
              0.07270088 = queryWeight, product of:
                1.2352829 = boost
                4.5620384 = idf(docFreq=1212, maxDocs=42740)
                0.012900731 = queryNorm
              0.40323102 = fieldWeight in 2499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5620384 = idf(docFreq=1212, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.036652807 = weight(abstract_txt:engine in 2499) [ClassicSimilarity], result of:
            0.036652807 = score(doc=2499,freq=1.0), product of:
              0.10630624 = queryWeight, product of:
                1.4937433 = boost
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.012900731 = queryNorm
              0.3447851 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.040861163 = weight(abstract_txt:algorithm in 2499) [ClassicSimilarity], result of:
            0.040861163 = score(doc=2499,freq=1.0), product of:
              0.11429514 = queryWeight, product of:
                1.548854 = boost
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.012900731 = queryNorm
              0.3575057 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.041668724 = weight(abstract_txt:algorithms in 2499) [ClassicSimilarity], result of:
            0.041668724 = score(doc=2499,freq=1.0), product of:
              0.115796134 = queryWeight, product of:
                1.5589911 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.012900731 = queryNorm
              0.35984555 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.02921897 = weight(abstract_txt:large in 2499) [ClassicSimilarity], result of:
            0.02921897 = score(doc=2499,freq=1.0), product of:
              0.1046231 = queryWeight, product of:
                1.8149139 = boost
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.012900731 = queryNorm
              0.27927837 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.049115714 = weight(abstract_txt:query in 2499) [ClassicSimilarity], result of:
            0.049115714 = score(doc=2499,freq=2.0), product of:
              0.11739637 = queryWeight, product of:
                1.9225142 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.012900731 = queryNorm
              0.41837507 = fieldWeight in 2499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.047538277 = weight(abstract_txt:search in 2499) [ClassicSimilarity], result of:
            0.047538277 = score(doc=2499,freq=5.0), product of:
              0.09315429 = queryWeight, product of:
                1.9774842 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.012900731 = queryNorm
              0.5103176 = fieldWeight in 2499, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.06473164 = weight(abstract_txt:independent in 2499) [ClassicSimilarity], result of:
            0.06473164 = score(doc=2499,freq=1.0), product of:
              0.17779878 = queryWeight, product of:
                2.3659556 = boost
                5.82516 = idf(docFreq=342, maxDocs=42740)
                0.012900731 = queryNorm
              0.3640725 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.82516 = idf(docFreq=342, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
          0.2964848 = weight(abstract_txt:pages in 2499) [ClassicSimilarity], result of:
            0.2964848 = score(doc=2499,freq=3.0), product of:
              0.49036482 = queryWeight, product of:
                6.8055387 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.012900731 = queryNorm
              0.6046209 = fieldWeight in 2499, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0625 = fieldNorm(doc=2499)
        0.4 = coord(10/25)
    
  2. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.19
    0.18752287 = sum of:
      0.18752287 = product of:
        0.52089685 = sum of:
          0.013565819 = weight(abstract_txt:most in 2605) [ClassicSimilarity], result of:
            0.013565819 = score(doc=2605,freq=1.0), product of:
              0.05480051 = queryWeight, product of:
                1.0724792 = boost
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.012900731 = queryNorm
              0.24754913 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.017754609 = weight(abstract_txt:data in 2605) [ClassicSimilarity], result of:
            0.017754609 = score(doc=2605,freq=2.0), product of:
              0.059572857 = queryWeight, product of:
                1.3695139 = boost
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.012900731 = queryNorm
              0.29803184 = fieldWeight in 2605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.10008901 = weight(abstract_txt:algorithm in 2605) [ClassicSimilarity], result of:
            0.10008901 = score(doc=2605,freq=6.0), product of:
              0.11429514 = queryWeight, product of:
                1.548854 = boost
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.012900731 = queryNorm
              0.8757066 = fieldWeight in 2605, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.072172344 = weight(abstract_txt:algorithms in 2605) [ClassicSimilarity], result of:
            0.072172344 = score(doc=2605,freq=3.0), product of:
              0.115796134 = queryWeight, product of:
                1.5589911 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.012900731 = queryNorm
              0.62327075 = fieldWeight in 2605, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.14719534 = weight(abstract_txt:chinese in 2605) [ClassicSimilarity], result of:
            0.14719534 = score(doc=2605,freq=7.0), product of:
              0.14040545 = queryWeight, product of:
                1.716677 = boost
                6.3398805 = idf(docFreq=204, maxDocs=42740)
                0.012900731 = queryNorm
              1.0483592 = fieldWeight in 2605, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.3398805 = idf(docFreq=204, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.041321862 = weight(abstract_txt:large in 2605) [ClassicSimilarity], result of:
            0.041321862 = score(doc=2605,freq=2.0), product of:
              0.1046231 = queryWeight, product of:
                1.8149139 = boost
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.012900731 = queryNorm
              0.39495924 = fieldWeight in 2605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.03682299 = weight(abstract_txt:search in 2605) [ClassicSimilarity], result of:
            0.03682299 = score(doc=2605,freq=3.0), product of:
              0.09315429 = queryWeight, product of:
                1.9774842 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.012900731 = queryNorm
              0.39529032 = fieldWeight in 2605, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.06473164 = weight(abstract_txt:independent in 2605) [ClassicSimilarity], result of:
            0.06473164 = score(doc=2605,freq=1.0), product of:
              0.17779878 = queryWeight, product of:
                2.3659556 = boost
                5.82516 = idf(docFreq=342, maxDocs=42740)
                0.012900731 = queryNorm
              0.3640725 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.82516 = idf(docFreq=342, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.027243197 = weight(abstract_txt:retrieval in 2605) [ClassicSimilarity], result of:
            0.027243197 = score(doc=2605,freq=1.0), product of:
              0.12580553 = queryWeight, product of:
                2.8145378 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.012900731 = queryNorm
              0.21655008 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
        0.36 = coord(9/25)
    
  3. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.17
    0.16646428 = sum of:
      0.16646428 = product of:
        0.5945153 = sum of:
          0.013565819 = weight(abstract_txt:most in 1800) [ClassicSimilarity], result of:
            0.013565819 = score(doc=1800,freq=1.0), product of:
              0.05480051 = queryWeight, product of:
                1.0724792 = boost
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.012900731 = queryNorm
              0.24754913 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.036652807 = weight(abstract_txt:engine in 1800) [ClassicSimilarity], result of:
            0.036652807 = score(doc=1800,freq=1.0), product of:
              0.10630624 = queryWeight, product of:
                1.4937433 = boost
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.012900731 = queryNorm
              0.3447851 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.040861163 = weight(abstract_txt:algorithm in 1800) [ClassicSimilarity], result of:
            0.040861163 = score(doc=1800,freq=1.0), product of:
              0.11429514 = queryWeight, product of:
                1.548854 = boost
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.012900731 = queryNorm
              0.3575057 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.072172344 = weight(abstract_txt:algorithms in 1800) [ClassicSimilarity], result of:
            0.072172344 = score(doc=1800,freq=3.0), product of:
              0.115796134 = queryWeight, product of:
                1.5589911 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.012900731 = queryNorm
              0.62327075 = fieldWeight in 1800, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.021259762 = weight(abstract_txt:search in 1800) [ClassicSimilarity], result of:
            0.021259762 = score(doc=1800,freq=1.0), product of:
              0.09315429 = queryWeight, product of:
                1.9774842 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.012900731 = queryNorm
              0.22822097 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.027243197 = weight(abstract_txt:retrieval in 1800) [ClassicSimilarity], result of:
            0.027243197 = score(doc=1800,freq=1.0), product of:
              0.12580553 = queryWeight, product of:
                2.8145378 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.012900731 = queryNorm
              0.21655008 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
          0.38276026 = weight(abstract_txt:pages in 1800) [ClassicSimilarity], result of:
            0.38276026 = score(doc=1800,freq=5.0), product of:
              0.49036482 = queryWeight, product of:
                6.8055387 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.012900731 = queryNorm
              0.7805622 = fieldWeight in 1800, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0625 = fieldNorm(doc=1800)
        0.28 = coord(7/25)
    
  4. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.16
    0.16431482 = sum of:
      0.16431482 = product of:
        0.58683866 = sum of:
          0.016957274 = weight(abstract_txt:most in 1219) [ClassicSimilarity], result of:
            0.016957274 = score(doc=1219,freq=4.0), product of:
              0.05480051 = queryWeight, product of:
                1.0724792 = boost
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.012900731 = queryNorm
              0.3094364 = fieldWeight in 1219, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.015446356 = weight(abstract_txt:useful in 1219) [ClassicSimilarity], result of:
            0.015446356 = score(doc=1219,freq=1.0), product of:
              0.08174313 = queryWeight, product of:
                1.309852 = boost
                4.8374305 = idf(docFreq=920, maxDocs=42740)
                0.012900731 = queryNorm
              0.18896213 = fieldWeight in 1219, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8374305 = idf(docFreq=920, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.03967783 = weight(abstract_txt:engine in 1219) [ClassicSimilarity], result of:
            0.03967783 = score(doc=1219,freq=3.0), product of:
              0.10630624 = queryWeight, product of:
                1.4937433 = boost
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.012900731 = queryNorm
              0.37324083 = fieldWeight in 1219, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.03611651 = weight(abstract_txt:algorithm in 1219) [ClassicSimilarity], result of:
            0.03611651 = score(doc=1219,freq=2.0), product of:
              0.11429514 = queryWeight, product of:
                1.548854 = boost
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.012900731 = queryNorm
              0.3159934 = fieldWeight in 1219, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7200913 = idf(docFreq=380, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.018261855 = weight(abstract_txt:large in 1219) [ClassicSimilarity], result of:
            0.018261855 = score(doc=1219,freq=1.0), product of:
              0.1046231 = queryWeight, product of:
                1.8149139 = boost
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.012900731 = queryNorm
              0.17454898 = fieldWeight in 1219, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.468454 = idf(docFreq=1331, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.046028737 = weight(abstract_txt:search in 1219) [ClassicSimilarity], result of:
            0.046028737 = score(doc=1219,freq=12.0), product of:
              0.09315429 = queryWeight, product of:
                1.9774842 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.012900731 = queryNorm
              0.4941129 = fieldWeight in 1219, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
          0.41435012 = weight(abstract_txt:pages in 1219) [ClassicSimilarity], result of:
            0.41435012 = score(doc=1219,freq=15.0), product of:
              0.49036482 = queryWeight, product of:
                6.8055387 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.012900731 = queryNorm
              0.8449834 = fieldWeight in 1219, product of:
                3.8729835 = tf(freq=15.0), with freq of:
                  15.0 = termFreq=15.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0390625 = fieldNorm(doc=1219)
        0.28 = coord(7/25)
    
  5. Lawrence, S.; Giles, C.L.: Inquirus, the NECI meta search engine (1998) 0.16
    0.16172616 = sum of:
      0.16172616 = product of:
        0.673859 = sum of:
          0.018308999 = weight(abstract_txt:both in 4605) [ClassicSimilarity], result of:
            0.018308999 = score(doc=4605,freq=1.0), product of:
              0.05107435 = queryWeight, product of:
                1.0353758 = boost
                3.8237588 = idf(docFreq=2537, maxDocs=42740)
                0.012900731 = queryNorm
              0.35847738 = fieldWeight in 4605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8237588 = idf(docFreq=2537, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
          0.03707125 = weight(abstract_txt:useful in 4605) [ClassicSimilarity], result of:
            0.03707125 = score(doc=4605,freq=1.0), product of:
              0.08174313 = queryWeight, product of:
                1.309852 = boost
                4.8374305 = idf(docFreq=920, maxDocs=42740)
                0.012900731 = queryNorm
              0.4535091 = fieldWeight in 4605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8374305 = idf(docFreq=920, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
          0.05497921 = weight(abstract_txt:engine in 4605) [ClassicSimilarity], result of:
            0.05497921 = score(doc=4605,freq=1.0), product of:
              0.10630624 = queryWeight, product of:
                1.4937433 = boost
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.012900731 = queryNorm
              0.51717764 = fieldWeight in 4605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
          0.073673576 = weight(abstract_txt:query in 4605) [ClassicSimilarity], result of:
            0.073673576 = score(doc=4605,freq=2.0), product of:
              0.11739637 = queryWeight, product of:
                1.9225142 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.012900731 = queryNorm
              0.62756264 = fieldWeight in 4605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
          0.045098767 = weight(abstract_txt:search in 4605) [ClassicSimilarity], result of:
            0.045098767 = score(doc=4605,freq=2.0), product of:
              0.09315429 = queryWeight, product of:
                1.9774842 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.012900731 = queryNorm
              0.4841298 = fieldWeight in 4605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
          0.44472718 = weight(abstract_txt:pages in 4605) [ClassicSimilarity], result of:
            0.44472718 = score(doc=4605,freq=3.0), product of:
              0.49036482 = queryWeight, product of:
                6.8055387 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.012900731 = queryNorm
              0.9069313 = fieldWeight in 4605, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.09375 = fieldNorm(doc=4605)
        0.24 = coord(6/25)
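
The content-based scores add two ingredients to the single-term case: a per-term boost inside queryWeight, and a coord factor that scales the summed term scores by the share of query terms matched. The sketch below, under the same assumed tf and idf forms as the listing suggests, recomputes the "pages" term for doc 4605 and then the final score of entry 5. The helper names are hypothetical; boost, queryNorm, and fieldNorm are copied from the explanation rather than derived.

```python
import math


def term_score(freq: float, doc_freq: int, max_docs: int, boost: float,
               query_norm: float, field_norm: float) -> float:
    # Assumed forms: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq).
    term_idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def combined_score(term_scores: list[float], query_terms: int) -> float:
    # coord(matched/total) multiplies the sum of the matched term scores.
    coord = len(term_scores) / query_terms
    return sum(term_scores) * coord


# "pages" in doc 4605, inputs copied from the explanation above.
pages_4605 = term_score(freq=3.0, doc_freq=435, max_docs=42740,
                        boost=6.8055387, query_norm=0.012900731,
                        field_norm=0.09375)

# The six per-term scores listed for doc 4605 (both, useful, engine,
# query, search, pages), combined with coord(6/25).
scores_4605 = [0.018308999, 0.03707125, 0.05497921,
               0.073673576, 0.045098767, 0.44472718]
total_4605 = combined_score(scores_4605, query_terms=25)

print(f"{pages_4605:.5f} {total_4605:.5f}")
```

The coord factor explains why entry 1, which matches 10 of 25 query terms, outranks entry 5 even though entry 5 has the larger raw sum (0.673859 versus 0.6491531).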