Search (2 results, page 1 of 1)

  • author_ss:"Wilkinson, D."
  • theme_ss:"Internet"
  1. Thelwall, M.; Wilkinson, D.: Finding similar academic Web sites with links, bibliometric couplings and colinks (2004) 0.02
    0.023531662 = product of:
      0.16472162 = sum of:
        0.16472162 = weight(_text_:sites in 2571) [ClassicSimilarity], result of:
          0.16472162 = score(doc=2571,freq=10.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.7749018 = fieldWeight in 2571, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.046875 = fieldNorm(doc=2571)
      0.14285715 = coord(1/7)
    
    Abstract
    A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web was taken and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increased likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage in terms of a much larger number of pairs of sites being connected by these measures, instead of increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.
  2. Thelwall, M.; Wilkinson, D.: Graph structure in three national academic Webs : power laws with anomalies (2003) 0.01
    0.01488273 = product of:
      0.10417911 = sum of:
        0.10417911 = weight(_text_:sites in 1681) [ClassicSimilarity], result of:
          0.10417911 = score(doc=1681,freq=4.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.49009097 = fieldWeight in 1681, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.046875 = fieldNorm(doc=1681)
      0.14285715 = coord(1/7)
    
    Abstract
    The graph structures of three national university publicly indexable Webs from Australia, New Zealand, and the UK were analyzed. Strong scale-free regularities for page indegrees, outdegrees, and connected component sizes were in evidence, resulting in power laws similar to those previously identified for individual university Web sites and for the AltaVista-indexed Web. Anomalies were also discovered in most distributions and were tracked down to root causes. As a result, resource driven Web sites and automatically generated pages were identified as representing a significant break from the assumptions of previous power law models. It follows that attempts to track average Web linking behavior would benefit from using techniques to minimize or eliminate the impact of such anomalies.
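
The score breakdowns listed for both results follow the same arithmetic, and it can be reproduced from the numbers shown. The sketch below is a minimal illustration, not the search engine's own code. It assumes the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed in the listings. The function names (classic_idf, classic_tf, classic_score) are hypothetical, and queryNorm, fieldNorm and coord are taken directly from the listings rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Term frequency factor: square root of the raw term count.
    return math.sqrt(freq)


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  matched_clauses, total_clauses):
    # Hypothetical helper that mirrors the explain tree:
    #   queryWeight = idf * queryNorm
    #   fieldWeight = tf * idf * fieldNorm
    #   score       = queryWeight * fieldWeight * coord
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    coord = matched_clauses / total_clauses
    return query_weight * field_weight * coord


# Values copied from the two listings (term "sites", docFreq=644, maxDocs=44218).
score_2571 = classic_score(10.0, 644, 44218, 0.04066292, 0.046875, 1, 7)
score_1681 = classic_score(4.0, 644, 44218, 0.04066292, 0.046875, 1, 7)

print(round(score_2571, 6), round(score_1681, 6))
```

Because the two documents share the same idf, queryNorm, fieldNorm and coord, the gap between their scores comes entirely from the term frequency factor: sqrt(10) for document 2571 against sqrt(4) for document 1681.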