Search (1 result, page 1 of 1)

  • × author_ss:"Cothey, V."
  • × theme_ss:"Informetrie"
  1. Cothey, V.: Web-crawling reliability (2004) 0.02
    0.017675493 = product of:
      0.035350986 = sum of:
        0.035350986 = product of:
          0.07070197 = sum of:
            0.07070197 = weight(_text_:i in 3089) [ClassicSimilarity], result of:
              0.07070197 = score(doc=3089,freq=4.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.41253293 = fieldWeight in 3089, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3089)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
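
The score tree above can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the standard Lucene ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and copies queryNorm and fieldNorm directly from the explanation rather than deriving them. The function and constant names are hypothetical, chosen for illustration.

```python
import math

# Values copied verbatim from the explanation output (not derived here).
QUERY_NORM = 0.045439374   # queryNorm
FIELD_NORM = 0.0546875     # fieldNorm(doc=3089), a lossy stored length norm


def classic_tf(freq: float) -> float:
    """Term-frequency factor of ClassicSimilarity: sqrt(freq)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency of ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq: float, doc_freq: int, max_docs: int,
                  coords=(0.5, 0.5)) -> float:
    """Recompute the final score from the leaves of the explanation tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * QUERY_NORM                      # queryWeight
    field_weight = classic_tf(freq) * idf * FIELD_NORM   # fieldWeight
    score = query_weight * field_weight                  # weight(_text_:i)
    for coord in coords:                                 # two coord(1/2) factors
        score *= coord
    return score


if __name__ == "__main__":
    print(f"idf   = {classic_idf(2765, 44218):.7f}")
    print(f"score = {explain_score(4.0, 2765, 44218):.9f}")
```

Lucene computes these values in 32-bit floats, so the last digits printed by this double-precision sketch may differ slightly from those in the tree.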
    
    Abstract
    In this article, I investigate the reliability, in the social science sense, of collecting informetric data about the World Wide Web by Web crawling. The investigation includes a critical examination of the practice of Web crawling and contrasts the results of content crawling with the results of link crawling. It is shown that Web crawling by search engines is intentionally biased and selective. I also report the results of a large-scale experimental simulation of Web crawling that illustrates the effects of different crawling policies on data collection. It is concluded that the reliability of Web crawling as a data collection technique is improved by fuller reporting of relevant crawling policies.