Search (6 results, page 1 of 1)

  • author_ss:"Thelwall, M."
  • author_ss:"Vaughan, L."
  1. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.03
    0.026009807 = product of:
      0.052019615 = sum of:
        0.052019615 = product of:
          0.10403923 = sum of:
            0.10403923 = weight(_text_:web in 674) [ClassicSimilarity], result of:
              0.10403923 = score(doc=674,freq=16.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.6119082 = fieldWeight in 674, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=674)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Introduces several new versions of PageRank (the link-based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that included pages from different Web sites; however, it did not work well in ranking pages from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
  2. Vaughan, L.; Thelwall, M.: Scholarly use of the Web : what are the key inducers of links to journal Web sites? (2003) 0.02
    0.022989638 = product of:
      0.045979276 = sum of:
        0.045979276 = product of:
          0.09195855 = sum of:
            0.09195855 = weight(_text_:web in 1236) [ClassicSimilarity], result of:
              0.09195855 = score(doc=1236,freq=18.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.5408555 = fieldWeight in 1236, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1236)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Web links have been studied by information scientists for at least six years, but it is only in the past two that clear evidence has emerged to show that counts of links to scholarly Web spaces (universities and departments) can correlate significantly with research measures, giving some credence to their use for the investigation of scholarly communication. This paper reports on a study to investigate the factors that influence the creation of links to journal Web sites. An empirical approach is used: collecting data and testing for significant patterns. The specific questions addressed are whether site age and site content are inducers of links to a journal's Web site as measured by the ratio of link counts to Journal Impact Factors, two variables previously discovered to be related. A new methodology for data collection is also introduced that uses the Internet Archive to obtain an earliest known creation date for Web sites. The results show that both site age and site content are significant factors for the disciplines studied: library and information science, and law. Comparisons between the two fields also show disciplinary differences in Web site characteristics. Scholars and publishers should be particularly aware that richer content on a journal's Web site tends to generate links and thus traffic to the site.
  3. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.02
    0.022989638 = product of:
      0.045979276 = sum of:
        0.045979276 = product of:
          0.09195855 = sum of:
            0.09195855 = weight(_text_:web in 4279) [ClassicSimilarity], result of:
              0.09195855 = score(doc=4279,freq=18.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.5408555 = fieldWeight in 4279, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4279)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
  4. Vaughan, L.; Thelwall, M.: A modelling approach to uncover hyperlink patterns : the case of Canadian universities (2005) 0.02
    0.018582305 = product of:
      0.03716461 = sum of:
        0.03716461 = product of:
          0.07432922 = sum of:
            0.07432922 = weight(_text_:web in 1014) [ClassicSimilarity], result of:
              0.07432922 = score(doc=1014,freq=6.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.43716836 = fieldWeight in 1014, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1014)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Hyperlink patterns between Canadian university Web sites were analyzed by a mathematical modeling approach. A multiple regression model was developed which shows that faculty quality and the language of the university are important predictors for links to a university Web site. Higher faculty quality means more links. French universities received lower numbers of links to their Web sites than comparable English universities. Analysis of interlinking between pairs of universities also showed that English universities are advantaged. Universities are more likely to link to each other when the geographical distance between them is less than 3000 km, possibly reflecting the east vs. west divide that exists in Canadian society.
  5. Vaughan, L.; Thelwall, M.: Search engine coverage bias : evidence and possible causes (2004) 0.02
    0.015927691 = product of:
      0.031855382 = sum of:
        0.031855382 = product of:
          0.063710764 = sum of:
            0.063710764 = weight(_text_:web in 2536) [ClassicSimilarity], result of:
              0.063710764 = score(doc=2536,freq=6.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.37471575 = fieldWeight in 2536, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2536)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to business and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance of being covered by search engines. We conclude that the coverage bias does exist, but that it is due not to deliberate choices by the search engines but to the natural cumulative advantage effects of US sites on the Web. Nevertheless, the bias remains a cause for international concern.
  6. Thelwall, M.; Vaughan, L.: Webometrics : an introduction to the special issue (2004) 0.01
    0.012261141 = product of:
      0.024522282 = sum of:
        0.024522282 = product of:
          0.049044564 = sum of:
            0.049044564 = weight(_text_:web in 2908) [ClassicSimilarity], result of:
              0.049044564 = score(doc=2908,freq=2.0), product of:
                0.17002425 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.052098576 = queryNorm
                0.2884563 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Webometrics, the quantitative study of Web phenomena, is a field encompassing contributions from information science, computer science, and statistical physics. Its methodology draws especially from bibliometrics. This special issue presents contributions that both push forward the field and illustrate a wide range of webometric approaches.
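
The score breakdowns listed under each result are labelled [ClassicSimilarity], and their numbers are consistent with a TF-IDF product in which tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The Python sketch below recomputes a result score from the values shown in the listing. It is an illustration under those assumptions, not the search engine's own code: the function names are hypothetical, and queryNorm, fieldNorm, and the two coord(1/2) factors are copied from the listing rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def result_score(
    freq: float,
    field_norm: float,
    doc_freq: int = 4597,            # docFreq for the term "web" in the listing
    max_docs: int = 44218,           # maxDocs in the listing
    query_norm: float = 0.052098576, # queryNorm, taken as given from the listing
    coords: tuple = (0.5, 0.5),      # the two coord(1/2) factors in the listing
) -> float:
    """Recompute a result score from the components shown in the breakdown."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * term_idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight         # weight(_text_:web in doc)
    for factor in coords:                       # apply each coord factor in turn
        score *= factor
    return score


if __name__ == "__main__":
    # Result 1 (doc 674): freq=16.0, fieldNorm=0.046875
    print(result_score(16.0, 0.046875))
    # Result 6 (doc 2908): freq=2.0, fieldNorm=0.0625
    print(result_score(2.0, 0.0625))
```

Because every result matches the same single term, the ranking differs only through freq and fieldNorm: result 1 ranks first because its 16 occurrences outweigh its mid-range fieldNorm, while result 6 has the largest fieldNorm but only 2 occurrences.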