Search (50 results, page 1 of 3)

Kousha, K.; Thelwall, M.: Google Scholar citations and Google Web/URL citations : a multi-discipline exploratory analysis (2007) 0.08
```
0.08258697 = product of:
  0.123880446 = sum of:
    0.11726358 = weight(_text_:sociology in 337) [ClassicSimilarity], result of:
      0.11726358 = score(doc=337,freq=2.0), product of:
        0.30495512 = queryWeight, product of:
          6.9606886 = idf(docFreq=113, maxDocs=44218)
          0.043811057 = queryNorm
        0.38452733 = fieldWeight in 337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.9606886 = idf(docFreq=113, maxDocs=44218)
          0.0390625 = fieldNorm(doc=337)
    0.006616868 = product of:
      0.013233736 = sum of:
        0.013233736 = weight(_text_:of in 337) [ClassicSimilarity], result of:
          0.013233736 = score(doc=337,freq=10.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.19316542 = fieldWeight in 337, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=337)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
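The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, while queryNorm, fieldNorm and the coord factors are copied from the tree rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in the "product of" nodes
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 337
MAX_DOCS = 44218
QUERY_NORM = 0.043811057
FIELD_NORM = 0.0390625

sociology = term_score(2.0, 113, MAX_DOCS, QUERY_NORM, FIELD_NORM)
of_term = term_score(10.0, 25162, MAX_DOCS, QUERY_NORM, FIELD_NORM)

inner = of_term * 0.5                      # coord(1/2) on the nested clause
total = (sociology + inner) * (2.0 / 3.0)  # coord(2/3) on the top-level query

print(f"sociology={sociology:.8f} of={of_term:.8f} total={total:.8f}")
```

Small differences in the last digits are expected, because the engine computes in single precision while this sketch uses double precision.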
Abstract

We use a new data gathering method, "Web/URL citation," Web/URL and Google Scholar to compare traditional and Web-based citation patterns across multiple disciplines (biology, chemistry, physics, computing, sociology, economics, psychology, and education) based upon a sample of 1,650 articles from 108 open access (OA) journals published in 2001. A Web/URL citation of an online journal article is a Web mention of its title, URL, or both. For each discipline, except psychology, we found significant correlations between Thomson Scientific (formerly Thomson ISI, here: ISI) citations and both Google Scholar and Google Web/URL citations. Google Scholar citations correlated more highly with ISI citations than did Google Web/URL citations, indicating that the Web/URL method measures a broader type of citation phenomenon. Google Scholar citations were more numerous than ISI citations in computer science and the four social science disciplines, suggesting that Google Scholar is more comprehensive for social sciences and perhaps also when conference articles are valued and published online. We also found large disciplinary differences in the percentage overlap between ISI and Google Scholar citation sources. Finally, although we found many significant trends, there were also numerous exceptions, suggesting that replacing traditional citation sources with the Web or Google Scholar for research impact calculations would be problematic.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.7, S.1055-1065
Levitt, J.M.; Thelwall, M.: Citation levels and collaboration within library and information science (2009) 0.02
```
0.019570632 = product of:
  0.058711894 = sum of:
    0.058711894 = sum of:
      0.016739499 = weight(_text_:of in 2734) [ClassicSimilarity], result of:
        0.016739499 = score(doc=2734,freq=16.0), product of:
          0.06850986 = queryWeight, product of:
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.043811057 = queryNorm
          0.24433708 = fieldWeight in 2734, product of:
            4.0 = tf(freq=16.0), with freq of:
              16.0 = termFreq=16.0
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2734)
      0.041972395 = weight(_text_:22 in 2734) [ClassicSimilarity], result of:
        0.041972395 = score(doc=2734,freq=4.0), product of:
          0.15341885 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.043811057 = queryNorm
          0.27358043 = fieldWeight in 2734, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2734)
  0.33333334 = coord(1/3)
```
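The explain trees in this listing nest by two-space indentation, and each "sum of" or "product of" node should equal the sum or product of its direct children. The sketch below parses the tree shown above for doc 2734 and checks that property. The `Node` class and the `parse_explain` and `consistent` helpers are hypothetical names introduced for illustration, and the tolerance is an assumption chosen to absorb single-precision rounding.

```python
import math
import re
from dataclasses import dataclass, field

# One explain line: indentation, numeric value, " = ", description
LINE = re.compile(r"^(?P<indent> *)(?P<value>[0-9.eE+-]+) = (?P<desc>.*)$")

@dataclass
class Node:
    value: float
    desc: str
    children: list = field(default_factory=list)

def parse_explain(text: str) -> Node:
    """Build a tree from indentation; assumes a single root line."""
    root = None
    stack = []  # (indent, node) pairs for the current ancestor chain
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue
        indent = len(m.group("indent"))
        node = Node(float(m.group("value")), m.group("desc"))
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            stack[-1][1].children.append(node)
        else:
            root = node
        stack.append((indent, node))
    return root

def consistent(node: Node, tol: float = 1e-6) -> bool:
    """Check every 'sum of' / 'product of' node against its children."""
    values = [child.value for child in node.children]
    if node.desc.endswith("sum of:"):
        ok = abs(sum(values) - node.value) <= tol
    elif node.desc.endswith("product of:"):
        ok = abs(math.prod(values) - node.value) <= tol
    else:
        ok = True  # leaves and "result of:" / "with freq of:" wrappers
    return ok and all(consistent(child, tol) for child in node.children)

# Explain output for doc 2734, copied from the listing above
EXPLAIN_2734 = """\
0.019570632 = product of:
  0.058711894 = sum of:
    0.058711894 = sum of:
      0.016739499 = weight(_text_:of in 2734) [ClassicSimilarity], result of:
        0.016739499 = score(doc=2734,freq=16.0), product of:
          0.06850986 = queryWeight, product of:
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.043811057 = queryNorm
          0.24433708 = fieldWeight in 2734, product of:
            4.0 = tf(freq=16.0), with freq of:
              16.0 = termFreq=16.0
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2734)
      0.041972395 = weight(_text_:22 in 2734) [ClassicSimilarity], result of:
        0.041972395 = score(doc=2734,freq=4.0), product of:
          0.15341885 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.043811057 = queryNorm
          0.27358043 = fieldWeight in 2734, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2734)
  0.33333334 = coord(1/3)
"""

root = parse_explain(EXPLAIN_2734)
print(root.value, consistent(root))
```

The same check can be run on any other explain block in the listing by pasting its text in place of `EXPLAIN_2734`.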
Abstract

Collaboration is a major research policy objective, but does it deliver higher quality research? This study uses citation analysis to examine the Web of Science (WoS) Information Science & Library Science subject category (IS&LS) to ascertain whether, in general, more highly cited articles are more highly collaborative than other articles. It consists of two investigations. The first investigation is a longitudinal comparison of the degree and proportion of collaboration in five strata of citation; it found that collaboration in the highest four citation strata (all in the most highly cited 22%) increased in unison over time, whereas collaboration in the lowest citation strata (un-cited articles) remained low and stable. Given that over 40% of the articles were un-cited, it seems important to take into account the differences found between un-cited articles and relatively highly cited articles when investigating collaboration in IS&LS. The second investigation compares collaboration for 35 influential information scientists; it found that their more highly cited articles on average were not more highly collaborative than their less highly cited articles. In summary, although collaborative research is conducive to high citation in general, collaboration has apparently not tended to be essential to the success of current and former elite information scientists.

Date

22. 3.2009 12:43:51

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.3, S.434-442
Kousha, K.; Thelwall, M.: How is science cited on the Web? : a classification of google unique Web citations (2007) 0.02
```
0.01700591 = product of:
  0.051017724 = sum of:
    0.051017724 = sum of:
      0.021338759 = weight(_text_:of in 586) [ClassicSimilarity], result of:
        0.021338759 = score(doc=586,freq=26.0), product of:
          0.06850986 = queryWeight, product of:
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.043811057 = queryNorm
          0.31146988 = fieldWeight in 586, product of:
            5.0990195 = tf(freq=26.0), with freq of:
              26.0 = termFreq=26.0
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.0390625 = fieldNorm(doc=586)
      0.029678967 = weight(_text_:22 in 586) [ClassicSimilarity], result of:
        0.029678967 = score(doc=586,freq=2.0), product of:
          0.15341885 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.043811057 = queryNorm
          0.19345059 = fieldWeight in 586, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=586)
  0.33333334 = coord(1/3)
```
Abstract

Although the analysis of citations in the scholarly literature is now an established and relatively well understood part of information science, not enough is known about citations that can be found on the Web. In particular, are there new Web types, and if so, are these trivial or potentially useful for studying or evaluating research communication? We sought evidence based upon a sample of 1,577 Web citations of the URLs or titles of research articles in 64 open-access journals from biology, physics, chemistry, and computing. Only 25% represented intellectual impact, from references of Web documents (23%) and other informal scholarly sources (2%). Many of the Web/URL citations were created for general or subject-specific navigation (45%) or for self-publicity (22%). Additional analyses revealed significant disciplinary differences in the types of Google unique Web/URL citations as well as some characteristics of scientific open-access publishing on the Web. We conclude that the Web provides access to a new and different type of citation information, one that may therefore enable us to measure different aspects of research, and the research process in particular; but to obtain good information, the different types should be separated.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.11, S.1631-1644
Thelwall, M.; Harries, G.: The connection between the research of a university and counts of links to its Web pages : an investigation based upon a classification of the relationships of pages to the research of the host university (2003) 0.00
```
0.0045800544 = product of:
  0.013740162 = sum of:
    0.013740162 = product of:
      0.027480325 = sum of:
        0.027480325 = weight(_text_:of in 1676) [ClassicSimilarity], result of:
          0.027480325 = score(doc=1676,freq=22.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.40111488 = fieldWeight in 1676, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1676)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Results from recent advances in link metrics have demonstrated that the hyperlink structure of national university systems can be strongly related to the research productivity of the individual institutions. This paper uses a page categorization to show that restricting the metrics to subsets more closely related to the research of the host university can produce even stronger associations. A partial overlap was also found between the effects of applying advanced document models and separating page types, but the best results were achieved through a combination of the two.

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.594-602
Thelwall, M.; Wilkinson, D.: Finding similar academic Web sites with links, bibliometric couplings and colinks (2004) 0.00
```
0.004267752 = product of:
  0.012803256 = sum of:
    0.012803256 = product of:
      0.025606511 = sum of:
        0.025606511 = weight(_text_:of in 2571) [ClassicSimilarity], result of:
          0.025606511 = score(doc=2571,freq=26.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.37376386 = fieldWeight in 2571, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2571)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web was taken and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increased likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage in terms of a much larger number of pairs of sites being connected by these measures, instead of increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.
Thelwall, M.: A layered approach for investigating the topological structure of communities in the Web (2003) 0.00
```
0.003945538 = product of:
  0.0118366135 = sum of:
    0.0118366135 = product of:
      0.023673227 = sum of:
        0.023673227 = weight(_text_:of in 4450) [ClassicSimilarity], result of:
          0.023673227 = score(doc=4450,freq=32.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.34554482 = fieldWeight in 4450, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4450)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

A layered approach for identifying communities in the Web is presented and explored by applying the Flake exact community identification algorithm to the UK academic Web. Although community or topic identification is a common task in information retrieval, a new perspective is developed by: the application of alternative document models, shifting the focus from individual pages to aggregated collections based upon Web directories, domains and entire sites; the removal of internal site links; and the adaptation of a new fast algorithm to allow fully-automated community identification using all possible single starting points. The overall topology of the graphs in the three least-aggregated layers was first investigated and found to include a large number of isolated points but, surprisingly, with most of the remainder being in one huge connected component, exact proportions varying by layer. The community identification process then found that the number of communities far exceeded the number of topological components, indicating that community identification is a potentially useful technique, even with random starting points. Both the number and size of communities identified were dependent on the parameter of the algorithm, with very different results being obtained in each case. In conclusion, the UK academic Web is embedded with layers of non-trivial communities and, if it is not unique in this, then there is the promise of improved results for information retrieval algorithms that can exploit this additional structure, and the application of the technique directly to partially automate Web metrics tasks such as that of finding all pages related to a given subject hosted by a single country's universities.

Source

Journal of documentation. 59(2003) no.4, S.410-429
Thelwall, M.: Interpreting social science link analysis research : a theoretical framework (2006) 0.00
```
0.003925761 = product of:
  0.011777283 = sum of:
    0.011777283 = product of:
      0.023554565 = sum of:
        0.023554565 = weight(_text_:of in 4908) [ClassicSimilarity], result of:
          0.023554565 = score(doc=4908,freq=22.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.34381276 = fieldWeight in 4908, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4908)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Link analysis in various forms is now an established technique in many different subjects, reflecting the perceived importance of links and of the Web. A critical but very difficult issue is how to interpret the results of social science link analyses. It is argued that the dynamic nature of the Web, its lack of quality control, and the online proliferation of copying and imitation mean that methodologies operating within a highly positivist, quantitative framework are ineffective. Conversely, the sheer variety of the Web makes application of qualitative methodologies and pure reason very problematic to large-scale studies. Methodology triangulation is consequently advocated, in combination with a warning that the Web is incapable of giving definitive answers to large-scale link analysis research questions concerning social factors underlying link creation. Finally, it is claimed that although theoretical frameworks are appropriate for guiding research, a Theory of Link Analysis is not possible.

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.1, S.60-68
Thelwall, M.: Webometrics (2009) 0.00
```
0.003925761 = product of:
  0.011777283 = sum of:
    0.011777283 = product of:
      0.023554565 = sum of:
        0.023554565 = weight(_text_:of in 3906) [ClassicSimilarity], result of:
          0.023554565 = score(doc=3906,freq=22.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.34381276 = fieldWeight in 3906, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3906)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Webometrics is an information science field concerned with measuring aspects of the World Wide Web (WWW) for a variety of information science research goals. It came into existence about five years after the Web was formed and has since grown to become a significant aspect of information science, at least in terms of published research. Although some webometrics research has focused on the structure or evolution of the Web itself or the performance of commercial search engines, most has used data from the Web to shed light on information provision or online communication in various contexts. Most prominently, techniques have been developed to track, map, and assess Web-based informal scholarly communication, for example, in terms of the hyperlinks between academic Web sites or the online impact of digital repositories. In addition, a range of nonacademic issues and groups of Web users have also been analyzed.

Source

Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
Thelwall, M.: Bibliometrics to webometrics (2009) 0.00
```
0.0039058835 = product of:
  0.01171765 = sum of:
    0.01171765 = product of:
      0.0234353 = sum of:
        0.0234353 = weight(_text_:of in 4239) [ClassicSimilarity], result of:
          0.0234353 = score(doc=4239,freq=16.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.34207192 = fieldWeight in 4239, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4239)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Bibliometrics has changed out of all recognition since 1958; becoming established as a field, being taught widely in library and information science schools, and being at the core of a number of science evaluation research groups around the world. This was all made possible by the work of Eugene Garfield and his Science Citation Index. This article reviews the distance that bibliometrics has travelled since 1958 by comparing early bibliometrics with current practice, and by giving an overview of a range of recent developments, such as patent analysis, national research evaluation exercises, visualization techniques, new applications, online citation indexes, and the creation of digital libraries. Webometrics, a modern, fast-growing offshoot of bibliometrics, is reviewed in detail. Finally, future prospects are discussed with regard to both bibliometrics and webometrics.
Thelwall, M.: Results from a web impact factor crawler (2001) 0.00
```
0.0038202507 = product of:
  0.011460752 = sum of:
    0.011460752 = product of:
      0.022921504 = sum of:
        0.022921504 = weight(_text_:of in 4490) [ClassicSimilarity], result of:
          0.022921504 = score(doc=4490,freq=30.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.33457235 = fieldWeight in 4490, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4490)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Web impact factors, the proposed web equivalent of impact factors for journals, can be calculated by using search engines. It has been found that the results are problematic because of the variable coverage of search engines as well as their ability to give significantly different results over short periods of time. The fundamental problem is that although some search engines provide a functionality that is capable of being used for impact calculations, this is not their primary task and therefore they do not give guarantees as to performance in this respect. In this paper, a bespoke web crawler designed specifically for the calculation of reliable WIFs is presented. This crawler was used to calculate WIFs for a number of UK universities, and the results of these calculations are discussed. The principal findings were that with certain restrictions, WIFs can be calculated reliably, but do not correlate with accepted research rankings owing to the variety of material hosted on university servers. Changes to the calculations to improve the fit of the results to research rankings are proposed, but there are still inherent problems undermining the reliability of the calculation. These problems still apply if the WIF scores are taken on their own as indicators of the general impact of any area of the Internet, but with care would not apply to online journals.

Source

Journal of documentation. 57(2001) no.2, S.177-191
Thelwall, M.: A comparison of sources of links for academic Web impact factor calculations (2002) 0.00
```
0.003743066 = product of:
  0.0112291975 = sum of:
    0.0112291975 = product of:
      0.022458395 = sum of:
        0.022458395 = weight(_text_:of in 4474) [ClassicSimilarity], result of:
          0.022458395 = score(doc=4474,freq=20.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32781258 = fieldWeight in 4474, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4474)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

There has been much recent interest in extracting information from collections of Web links. One tool that has been used is Ingwersen's Web impact factor. It has been demonstrated that several versions of this metric can produce results that correlate with research ratings of British universities showing that, despite being a measure of a purely Internet phenomenon, the results are susceptible to a wider interpretation. This paper addresses the question of which is the best possible domain to count backlinks from, if research is the focus of interest. WIFs for British universities calculated from several different source domains are compared, primarily the .edu, .ac.uk and .uk domains, and the entire Web. The results show that all four areas produce WIFs that correlate strongly with research ratings, but that none produce incontestably superior figures. It was also found that the WIF was less able to differentiate in more homogeneous subsets of universities, although positive results are still possible.

Source

Journal of documentation. 58(2002) no.1, S.66-78
Thelwall, M.; Prabowo, R.: Identifying and characterizing public science-related fears from RSS feeds (2007) 0.00
```
0.003743066 = product of:
  0.0112291975 = sum of:
    0.0112291975 = product of:
      0.022458395 = sum of:
        0.022458395 = weight(_text_:of in 137) [ClassicSimilarity], result of:
          0.022458395 = score(doc=137,freq=20.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32781258 = fieldWeight in 137, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=137)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

A feature of modern democracies is public mistrust of scientists and the politicization of science policy, e.g., concerning stem cell research and genetically modified food. While the extent of this mistrust is debatable, its political influence is tangible. Hence, science policy researchers and science policy makers need early warning of issues that resonate with a wide public so that they can make timely and informed decisions. In this article, a semi-automatic method for identifying significant public science-related concerns from a corpus of Internet-based RSS (Really Simple Syndication) feeds is described and shown to be an improvement on a previous similar system because of the introduction of feed-based aggregation. In addition, both the RSS corpus and the concept of public science-related fears are deconstructed, revealing hidden complexity. This article also provides evidence that genetically modified organisms and stem cell research were the two major policy-relevant science concern issues, although mobile phone radiation and software security also generated significant interest.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.379-390
Angus, E.; Thelwall, M.; Stuart, D.: General patterns of tag usage among university groups in Flickr (2008) 0.00
```
0.003743066 = product of:
  0.0112291975 = sum of:
    0.0112291975 = product of:
      0.022458395 = sum of:
        0.022458395 = weight(_text_:of in 2554) [ClassicSimilarity], result of:
          0.022458395 = score(doc=2554,freq=20.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32781258 = fieldWeight in 2554, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2554)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Purpose - The purpose of this research is to investigate general patterns of tag usage and determine the usefulness of the tags used within university image groups to the wider Flickr community. There has been a significant rise in the use of Web 2.0 social network web sites and online applications in recent years. One of the most popular is Flickr, an online image management application. Design/methodology/approach - This study uses a webometric data collection, classification and informetric analysis. Findings - The results show that members of university image groups tend to tag in a manner that is of use to users of the system as a whole rather than merely for the tag creator. Originality/value - This paper gives a valuable insight into the tagging practices of image groups in Flickr.
Thelwall, M.: Homophily in MySpace (2009) 0.00
```
0.003743066 = product of:
  0.0112291975 = sum of:
    0.0112291975 = product of:
      0.022458395 = sum of:
        0.022458395 = weight(_text_:of in 2706) [ClassicSimilarity], result of:
          0.022458395 = score(doc=2706,freq=20.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32781258 = fieldWeight in 2706, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2706)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Social network sites like MySpace are increasingly important environments for expressing and maintaining interpersonal connections, but does online communication exacerbate or ameliorate the known tendency for offline friendships to form between similar people (homophily)? This article reports an exploratory study of the similarity between the reported attributes of pairs of active MySpace Friends based upon a systematic sample of 2,567 members joining on June 18, 2007 and Friends who commented on their profile. The results showed no evidence of gender homophily but significant evidence of homophily for ethnicity, religion, age, country, marital status, attitude towards children, sexual orientation, and reason for joining MySpace. There were also some imbalances: women and the young were disproportionately commenters, and commenters tended to have more Friends than commentees. Overall, it seems that although traditional sources of homophily are thriving in MySpace networks of active public connections, gender homophily has completely disappeared. Finally, the method used has wide potential for investigating and partially tracking homophily in society, providing early warning of socially divisive trends.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.2, S.219-231
Thelwall, M.: Extracting macroscopic information from Web links (2001) 0.00
```
0.0036907129 = product of:
  0.011072138 = sum of:
    0.011072138 = product of:
      0.022144277 = sum of:
        0.022144277 = weight(_text_:of in 6851) [ClassicSimilarity], result of:
          0.022144277 = score(doc=6851,freq=28.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32322758 = fieldWeight in 6851, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6851)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Much has been written about the potential and pitfalls of macroscopic Web-based link analysis, yet there have been no studies that have provided clear statistical evidence that any of the proposed calculations can produce results over large areas of the Web that correlate with phenomena external to the Internet. This article attempts to provide such evidence through an evaluation of Ingwersen's (1998) proposed external Web Impact Factor (WIF) for the original use of the Web: the interlinking of academic research. In particular, it studies the case of the relationship between academic hyperlinks and research activity for universities in Britain, a country chosen for its variety of institutions and the existence of an official government rating exercise for research. After reviewing the numerous reasons why link counts may be unreliable, it demonstrates that four different WIFs do, in fact, correlate with the conventional academic research measures. The WIF delivering the greatest correlation with research rankings was the ratio of Web pages with links pointing at research-based pages to faculty numbers. The scarcity of links to electronic academic papers in the data set suggests that, in contrast to citation analysis, this WIF is measuring the reputations of universities and their scholars, rather than the quality of their publications.

Source

Journal of the American Society for Information Science and Technology. 52(2001) no.13, S.1157-1168
Thelwall, M.; Harries, G.: Do the Web Sites of Higher Rated Scholars Have Significantly More Online Impact? (2004) 0.00
```
0.0036907129 = product of:
  0.011072138 = sum of:
    0.011072138 = product of:
      0.022144277 = sum of:
        0.022144277 = weight(_text_:of in 2123) [ClassicSimilarity], result of:
          0.022144277 = score(doc=2123,freq=28.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32322758 = fieldWeight in 2123, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
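The score tree above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that reproduce the printed values: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical and chosen only for illustration; the input numbers are copied from the explain output for doc 2123.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula; it matches idf(docFreq=25162, maxDocs=44218) above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Term score = queryWeight * fieldWeight, as in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                  # tf(freq=28.0) in the tree
    query_weight = idf * query_norm       # idf * queryNorm
    field_weight = tf * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values for the term "of" in doc 2123, taken from the explain output.
term_score = classic_term_score(
    freq=28.0,
    doc_freq=25162,
    max_docs=44218,
    query_norm=0.043811057,
    field_norm=0.0390625,
)

# Two coordination factors are applied on top: coord(1/2), then coord(1/3).
final_score = term_score * 0.5 * (1.0 / 3.0)

# The explain output reports 0.022144277 and 0.0036907129 for these two values.
print(term_score, final_score)
```

Because only the very common term "of" matched, the final score is dominated by its low idf and the two coord penalties, which is why the record is displayed with a rounded score of 0.00.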
Abstract

The quality and impact of academic Web sites are of interest to many audiences, including the scholars who use them and Web educators who need to identify best practice. Several large-scale European Union research projects have been funded to build new indicators for online scientific activity, reflecting recognition of the importance of the Web for scholarly communication. In this paper we address the key question of whether higher rated scholars produce higher impact Web sites, using the United Kingdom as a case study and measuring scholars' quality in terms of university-wide average research ratings. Methodological issues concerning the measurement of the online impact are discussed, leading to the adoption of counts of links to a university's constituent single domain Web sites from an aggregated counting metric. The findings suggest that universities with higher rated scholars produce significantly more Web content but with a similar average online impact. Higher rated scholars therefore attract more total links from their peers, but only by being more prolific, refuting earlier suggestions. It can be surmised that general Web publications are very different from scholarly journal articles and conference papers, for which scholarly quality does associate with citation impact. This has important implications for the construction of new Web indicators, for example that online impact should not be used to assess the quality of small groups of scholars, even within a single discipline.

Source

Journal of the American Society for Information Science and Technology. 55(2004) no.2, S.149-159
Thelwall, M.; Stuart, D.: Web crawling ethics revisited : cost, privacy, and denial of service (2006) 0.00
```
0.0036536194 = product of:
  0.010960858 = sum of:
    0.010960858 = product of:
      0.021921717 = sum of:
        0.021921717 = weight(_text_:of in 6098) [ClassicSimilarity], result of:
          0.021921717 = score(doc=6098,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.31997898 = fieldWeight in 6098, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6098)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Ethical aspects of the employment of Web crawlers for information science research and other contexts are reviewed. The difference between legal and ethical uses of communications technologies is emphasized as well as the changing boundary between ethical and unethical conduct. A review of the potential impacts on Web site owners is used to underpin a new framework for ethical crawling, and it is argued that delicate human judgment is required for each individual case, with verdicts likely to change over time. Decisions can be based upon an approximate cost-benefit analysis, but it is crucial that crawler owners find out about the technological issues affecting the owners of the sites being crawled in order to produce an informed assessment.

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.13, S.1771-1779
Barjak, F.; Thelwall, M.: A statistical analysis of the web presences of European life sciences research teams (2008) 0.00
```
0.00355646 = product of:
  0.0106693795 = sum of:
    0.0106693795 = product of:
      0.021338759 = sum of:
        0.021338759 = weight(_text_:of in 1383) [ClassicSimilarity], result of:
          0.021338759 = score(doc=1383,freq=26.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.31146988 = fieldWeight in 1383, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1383)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Web links have been used for around ten years to explore the online impact of academic information and information producers. Nevertheless, few studies have attempted to relate link counts to relevant offline attributes of the owners of the targeted Web sites, with the exception of research productivity. This article reports the results of a study to relate site inlink counts to relevant owner characteristics for over 400 European life-science research group Web sites. The analysis confirmed that research-group size and Web-presence size were important for attracting Web links, although research productivity was not. Little evidence was found for significant influence of any of an array of factors, including research-group leader gender and industry connections. In addition, the choice of search engine for link data created a surprising international difference in the results, with Google perhaps giving unreliable results. Overall, the data collection, statistical analysis and results interpretation were all complex and it seems that we still need to know more about search engines, hyperlinks, and their function in science before we can draw conclusions on their usefulness and role in the canon of science and technology indicators.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.4, S.628-643
Thelwall, M.: Quantitative comparisons of search engine results (2008) 0.00
```
0.00355646 = product of:
  0.0106693795 = sum of:
    0.0106693795 = product of:
      0.021338759 = sum of:
        0.021338759 = weight(_text_:of in 2350) [ClassicSimilarity], result of:
          0.021338759 = score(doc=2350,freq=26.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.31146988 = fieldWeight in 2350, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2350)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the application programming interfaces of Google, Yahoo!, and Live Search for 1,587 single word searches. The hit count estimates were broadly consistent, but with Yahoo! and Google reporting 5-6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates but Yahoo! is recommended for all other Webometric purposes.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.11, S.1702-1710
Vaughan, L.; Thelwall, M.: Search engine coverage bias : evidence and possible causes (2004) 0.00
```
0.0035509837 = product of:
  0.010652951 = sum of:
    0.010652951 = product of:
      0.021305902 = sum of:
        0.021305902 = weight(_text_:of in 2536) [ClassicSimilarity], result of:
          0.021305902 = score(doc=2536,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.3109903 = fieldWeight in 2536, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2536)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to business and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance to be covered by search engines. We conclude that the coverage bias does exist but this is due not to deliberate choices of the search engines but occurs as a natural result of cumulative advantage effects of US sites on the Web. Nevertheless, the bias remains a cause for international concern.
