Search (1 result, page 1 of 1)

  • author_ss:"Brin, S."
  • theme_ss:"Internet"
  1. Brin, S.: Extracting patterns and relations from the World Wide Web (1999) 0.02
    0.02317223 = product of:
      0.10813707 = sum of:
        0.06362897 = weight(_text_:wide in 3970) [ClassicSimilarity], result of:
          0.06362897 = score(doc=3970,freq=4.0), product of:
            0.1312982 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029633347 = queryNorm
            0.4846142 = fieldWeight in 3970, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3970)
        0.034519844 = weight(_text_:web in 3970) [ClassicSimilarity], result of:
          0.034519844 = score(doc=3970,freq=4.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.35694647 = fieldWeight in 3970, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3970)
        0.009988253 = weight(_text_:information in 3970) [ClassicSimilarity], result of:
          0.009988253 = score(doc=3970,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1920054 = fieldWeight in 3970, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3970)
      0.21428572 = coord(3/14)
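
    The score breakdown above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the figures shown. The function and variable names are illustrative and not part of any Lucene API; queryNorm, fieldNorm, and the coord factor are copied from the output rather than derived.

    ```python
    import math

    # Values copied from the explain output above.
    MAX_DOCS = 44218
    QUERY_NORM = 0.029633347
    FIELD_NORM = 0.0546875
    COORD = 3 / 14  # 3 of 14 query clauses matched

    def idf(doc_freq, max_docs):
        """Classic TF-IDF inverse document frequency (assumed formula)."""
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def term_score(freq, doc_freq):
        """Score of one term: queryWeight * fieldWeight."""
        tf = math.sqrt(freq)
        term_idf = idf(doc_freq, MAX_DOCS)
        query_weight = term_idf * QUERY_NORM
        field_weight = tf * term_idf * FIELD_NORM
        return query_weight * field_weight

    # term -> (termFreq, docFreq), as listed for doc 3970
    terms = {
        "wide": (4.0, 1430),
        "web": (4.0, 4597),
        "information": (4.0, 20772),
    }

    total = sum(term_score(freq, df) for freq, df in terms.values())
    score = total * COORD

    # Should be close to the 0.02317223 reported above; small differences
    # come from Lucene using 32-bit floats.
    print(round(score, 8))
    ```

    Because idf enters both queryWeight and fieldWeight, each term's contribution grows with the square of its idf, which is why the rarer term "wide" outweighs "web" and "information" at the same term frequency.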
    
    Abstract
    The WWW is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique, we use it to extract a relation of (author, title) pairs from the WWW.
    Source
    The World Wide Web and Databases: International Workshop WebDB'98, Valencia, Spain, March 27-28, 1998, Selected papers. Eds.: P. Atzeni et al.
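
    The pattern/relation duality mentioned in the abstract can be illustrated with a toy bootstrapping loop. This is a rough sketch under strong assumptions, not the paper's actual algorithm: the corpus is a handful of invented one-line strings, a "pattern" is simply the text between a title and an author, and all function names are hypothetical.

    ```python
    def find_patterns(corpus, pairs):
        """Derive patterns: the text separating a known title from its author."""
        patterns = set()
        for line in corpus:
            for author, title in pairs:
                if line.startswith(title) and line.endswith(author):
                    middle = line[len(title):len(line) - len(author)]
                    if middle:
                        patterns.add(middle)
        return patterns

    def apply_patterns(corpus, patterns):
        """Derive pairs: split each line on a known pattern."""
        found = set()
        for line in corpus:
            for middle in patterns:
                title, sep, author = line.partition(middle)
                if sep and title and author:
                    found.add((author, title))
        return found

    def grow_relation(corpus, seed, max_rounds=5):
        """Alternate between patterns and pairs until nothing new appears."""
        relation = set(seed)
        for _ in range(max_rounds):
            patterns = find_patterns(corpus, relation)
            grown = relation | apply_patterns(corpus, patterns)
            if grown == relation:
                break
            relation = grown
        return relation

    # Invented toy corpus with two different formats.
    corpus = [
        "Foundation, by Isaac Asimov",
        "Dune, by Frank Herbert",
        "Neuromancer, by William Gibson",
        "Dune / Frank Herbert",
        "Emma / Jane Austen",
    ]
    seed = {("Isaac Asimov", "Foundation")}

    relation = grow_relation(corpus, seed)
    for author, title in sorted(relation):
        print(author, "|", title)
    ```

    In this toy run the single seed pair yields the ", by " pattern, which finds further pairs; one of those pairs also occurs in the " / " format, which exposes a second pattern and reaches a pair the first pattern could not. A realistic system would also need to guard against overly general patterns, which this sketch does not attempt.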