Search (2 results, page 1 of 1)

Lawrence, S.; Giles, C.L.: Accessibility and distribution of information on the Web (1999) 0.00
```
0.0025239778 = product of:
  0.010095911 = sum of:
    0.010095911 = weight(_text_:information in 4952) [ClassicSimilarity], result of:
      0.010095911 = score(doc=4952,freq=4.0), product of:
        0.06134496 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.034944877 = queryNorm
        0.16457605 = fieldWeight in 4952, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4952)
  0.25 = coord(1/4)
```
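The explain tree above is a chain of products, so it can be recomputed by hand. The following is a minimal sketch rather than the engine's actual code: the function name `classic_similarity_score` and its parameters are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption that happens to reproduce the idf value shown for this index.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, field_norm, query_norm, coord):
    """Recompute a single-term TF-IDF score from the values in an explain tree."""
    tf = math.sqrt(freq)                             # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # assumed classic idf formula
    query_weight = idf * query_norm                  # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm             # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord       # scaled by coord(1/4)

# Values copied from the explain output for doc 4952
score = classic_similarity_score(
    freq=4.0, doc_freq=20772, max_docs=44218,
    field_norm=0.046875, query_norm=0.034944877, coord=0.25,
)
print(f"{score:.10f}")
```

The result should agree with the reported 0.0025239778 up to floating-point rounding, since the engine works in single precision and this sketch uses double precision.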
Abstract

Search engine coverage relative to the estimated size of the publicly indexable web has decreased substantially since December 97, with no engine indexing more than about 16% of the estimated size of the publicly indexable web. (Note that many queries can be satisfied with a relatively small database.) Search engines are typically more likely to index sites that have more links to them (more 'popular' sites). They are also typically more likely to index US sites than non-US sites (AltaVista is an exception), and more likely to index commercial sites than educational sites. Indexing of new or modified pages by just one of the major search engines can take months. 83% of sites contain commercial content and 6% contain scientific or educational content. Only 1.5% of sites contain pornographic content. The publicly indexable web contains an estimated 800 million pages as of February 1999, encompassing about 15 terabytes of information or about 6 terabytes of text after removing HTML tags, comments, and extra whitespace. The simple HTML "keywords" and "description" metatags are only used on the homepages of 34% of sites. Only 0.3% of sites use the Dublin Core metadata standard.
Lawrence, S.; Giles, C.L.: Inquirus, the NECI meta search engine (1998) 0.00
```
0.0020821756 = product of:
  0.008328702 = sum of:
    0.008328702 = weight(_text_:information in 3604) [ClassicSimilarity], result of:
      0.008328702 = score(doc=3604,freq=2.0), product of:
        0.06134496 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.034944877 = queryNorm
        0.13576832 = fieldWeight in 3604, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3604)
  0.25 = coord(1/4)
```
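Both explain outputs share the same idf, queryNorm, and coord values, so the difference between the two scores comes only from the term frequency and the field norm. The sketch below illustrates this under the assumption that the score is the plain product shown in the explain trees; the helper name `doc_factor` is hypothetical.

```python
import math

# Per-document inputs copied from the two explain outputs
docs = {
    4952: {"freq": 4.0, "field_norm": 0.046875},
    3604: {"freq": 2.0, "field_norm": 0.0546875},
}

def doc_factor(freq, field_norm):
    """Document-specific part of the score: tf(freq) * fieldNorm."""
    return math.sqrt(freq) * field_norm

factors = {doc_id: doc_factor(**values) for doc_id, values in docs.items()}

# Ratio of the document-specific factors; the shared factors cancel out
ratio = factors[4952] / factors[3604]
print(f"doc 4952 scores about {ratio:.3f} times doc 3604")
```

Doc 3604 has the larger field norm, which usually indicates a shorter field, but doc 4952 contains the term four times instead of twice, and that higher term frequency outweighs the norm.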
Abstract

Presents Inquirus, a WWW meta search engine which works by downloading and analysing the individual documents. It makes improvements over existing search engines in a number of areas: more useful document summaries incorporating query term context, identification of both pages which no longer exist and pages which no longer contain the query terms, advanced detection of duplicate pages, improved document ranking using proximity information, dramatically improved precision for certain queries by using specific expressive forms, and quick jump links and highlighting when viewing the full document.