Search (24 results, page 1 of 2)

  • year_i:[2000 TO 2010}
  • author_ss:"Thelwall, M."
  1. Zuccala, A.; Thelwall, M.; Oppenheim, C.; Dhiensa, R.: Web intelligence analyses of digital libraries : a case study of the National electronic Library for Health (NeLH) (2007) 0.04
    0.038431544 = product of:
      0.102484114 = sum of:
        0.026727835 = weight(_text_:libraries in 838) [ClassicSimilarity], result of:
          0.026727835 = score(doc=838,freq=4.0), product of:
            0.13017908 = queryWeight, product of:
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.03962768 = queryNorm
            0.2053159 = fieldWeight in 838, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.03125 = fieldNorm(doc=838)
        0.047871374 = weight(_text_:case in 838) [ClassicSimilarity], result of:
          0.047871374 = score(doc=838,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.2747759 = fieldWeight in 838, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03125 = fieldNorm(doc=838)
        0.027884906 = weight(_text_:studies in 838) [ClassicSimilarity], result of:
          0.027884906 = score(doc=838,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.17634688 = fieldWeight in 838, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03125 = fieldNorm(doc=838)
      0.375 = coord(3/8)
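The breakdown above is Lucene ClassicSimilarity explain output. As a sanity check, a minimal Python sketch of the standard ClassicSimilarity formula (tf = sqrt(freq), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with the clause sum scaled by coord) reproduces the first result's score; the constants are copied from the explanation above, and the function names are my own.

```python
import math

def term_score(freq, idf, query_norm, field_norm):
    """One clause of a ClassicSimilarity score: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                  # e.g. 2.0 = tf(freq=4.0)
    query_weight = idf * query_norm       # e.g. 3.2850544 * 0.03962768
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

QUERY_NORM = 0.03962768  # queryNorm shared by all clauses above
clauses = [
    # (freq, idf, fieldNorm) for "libraries", "case", "studies" in doc 838
    (4.0, 3.2850544, 0.03125),
    (4.0, 4.3964143, 0.03125),
    (2.0, 3.9902744, 0.03125),
]
total = sum(term_score(f, idf, QUERY_NORM, fn) for f, idf, fn in clauses)
print(total * 3 / 8)  # coord(3/8) -> 0.038431..., matching 0.038431544
```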
    
    Abstract
Purpose - The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH). Design/methodology/approach - The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data. Findings - Link data, when analysed together with user transaction log files (i.e. Web referring domains), can provide insights into who is using a digital library and when, and who could be using the digital library if they are "surfing" a particular part of the Web; in this case, any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks, search engine searches, or non-electronic sources. Originality/value - A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real-time user transactions, while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.
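The abstract describes LexiURL's core report, listing and counting the distinct domain names in a set of links. A minimal sketch of that kind of summary (an illustration, not LexiURL's actual code; the URLs are hypothetical) could be:

```python
from collections import Counter
from urllib.parse import urlparse

def domain_report(urls):
    """Count how often each domain (and top-level domain) appears in a URL list."""
    domains = Counter(urlparse(u).hostname or "" for u in urls)
    tlds = Counter(d.rsplit(".", 1)[-1] for d in domains.elements() if d)
    return domains, tlds

# Hypothetical referring links to the NeLH site
links = ["http://www.example.edu/health/links.html",
         "http://www.example.gov.uk/nhs.html",
         "http://lib.example.edu/guides/nelh.html"]
domains, tlds = domain_report(links)
print(domains.most_common())  # who links: [('www.example.edu', 1), ...]
print(tlds.most_common())     # sector mix: [('edu', 2), ('uk', 1)]
```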
  2. Thelwall, M.: Extracting macroscopic information from Web links (2001) 0.02
    0.022901682 = product of:
      0.09160673 = sum of:
        0.042312715 = weight(_text_:case in 6851) [ClassicSimilarity], result of:
          0.042312715 = score(doc=6851,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 6851, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6851)
        0.049294014 = weight(_text_:studies in 6851) [ClassicSimilarity], result of:
          0.049294014 = score(doc=6851,freq=4.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.3117402 = fieldWeight in 6851, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6851)
      0.25 = coord(2/8)
    
    Abstract
Much has been written about the potential and pitfalls of macroscopic Web-based link analysis, yet there have been no studies that have provided clear statistical evidence that any of the proposed calculations can produce results over large areas of the Web that correlate with phenomena external to the Internet. This article attempts to provide such evidence through an evaluation of Ingwersen's (1998) proposed external Web Impact Factor (WIF) for the original use of the Web: the interlinking of academic research. In particular, it studies the case of the relationship between academic hyperlinks and research activity for universities in Britain, a country chosen for its variety of institutions and the existence of an official government rating exercise for research. After reviewing the numerous reasons why link counts may be unreliable, it demonstrates that four different WIFs do, in fact, correlate with the conventional academic research measures. The WIF delivering the greatest correlation with research rankings was the ratio of Web pages with links pointing at research-based pages to faculty numbers. The scarcity of links to electronic academic papers in the data set suggests that, in contrast to citation analysis, this WIF is measuring the reputations of universities and their scholars, rather than the quality of their publications.
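The best-performing WIF above is a plain ratio: the number of Web pages with links pointing at a university's research-based pages divided by its faculty numbers. A sketch with hypothetical figures:

```python
def web_impact_factor(linking_pages, faculty):
    """WIF variant from the abstract: pages linking to research pages / faculty count."""
    return linking_pages / faculty

# Hypothetical universities: (pages with links to research pages, faculty numbers)
universities = {"Univ A": (1200, 400), "Univ B": (300, 250), "Univ C": (2500, 900)}
for name, (pages, staff) in universities.items():
    print(name, round(web_impact_factor(pages, staff), 2))
# The study then tests whether these ratios correlate with official research ratings.
```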
  3. Thelwall, M.: Directing students to new information types : a new role for Google in literature searches? (2005) 0.02
    0.020994222 = product of:
      0.08397689 = sum of:
        0.04677371 = weight(_text_:libraries in 364) [ClassicSimilarity], result of:
          0.04677371 = score(doc=364,freq=4.0), product of:
            0.13017908 = queryWeight, product of:
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.03962768 = queryNorm
            0.35930282 = fieldWeight in 364, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.0546875 = fieldNorm(doc=364)
        0.037203178 = product of:
          0.074406356 = sum of:
            0.074406356 = weight(_text_:area in 364) [ClassicSimilarity], result of:
              0.074406356 = score(doc=364,freq=2.0), product of:
                0.1952553 = queryWeight, product of:
                  4.927245 = idf(docFreq=870, maxDocs=44218)
                  0.03962768 = queryNorm
                0.38107216 = fieldWeight in 364, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.927245 = idf(docFreq=870, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=364)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
Conducting a literature review is an important activity for postgraduates and many undergraduates. Librarians can play an important role, directing students to digital libraries, compiling online subject resource lists, and educating about the need to evaluate the quality of online resources. In order to conduct an effective literature search in a new area, however, in some subjects it is necessary to gain basic topic knowledge, including specialist vocabularies. Google's link-based page ranking algorithm makes this search engine an ideal tool for finding specialist topic introductory material, particularly in computer science, and so librarians should be teaching this as part of a strategic literature review approach.
    Source
Libraries and Google. Eds.: Miller, W. and R.M. Pellen
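The abstract credits Google's link-based page ranking algorithm. A minimal power-iteration sketch of PageRank in its standard textbook form (the four-page toy graph is invented) shows the idea:

```python
def pagerank(links, damping=0.85, iters=50):
    """Power iteration over an adjacency dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            targets = outs if outs else pages   # dangling page: spread evenly
            share = rank[p] / len(targets)
            for q in targets:
                new[q] += damping * share
        rank = new
    return rank

toy_web = {"intro": ["spec"], "spec": ["intro"],
           "blog": ["intro", "spec"], "orphan": []}
print(pagerank(toy_web))  # well-linked introductory pages rank highest
```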
  4. Thelwall, M.; Prabowo, R.; Fairclough, R.: Are raw RSS feeds suitable for broad issue scanning? : a science concern case study (2006) 0.01
    0.007479902 = product of:
      0.059839215 = sum of:
        0.059839215 = weight(_text_:case in 6116) [ClassicSimilarity], result of:
          0.059839215 = score(doc=6116,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34346986 = fieldWeight in 6116, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6116)
      0.125 = coord(1/8)
    
    Abstract
    Broad issue scanning is the task of identifying important public debates arising in a given broad issue; really simple syndication (RSS) feeds are a natural information source for investigating broad issues. RSS, as originally conceived, is a method for publishing timely and concise information on the Internet, for example, about the main stories in a news site or the latest postings in a blog. RSS feeds are potentially a nonintrusive source of high-quality data about public opinion: Monitoring a large number may allow quantitative methods to extract information relevant to a given need. In this article we describe an RSS feed-based coword frequency method to identify bursts of discussion relevant to a given broad issue. A case study of public science concerns is used to demonstrate the method and assess the suitability of raw RSS feeds for broad issue scanning (i.e., without data cleansing). An attempt to identify genuine science concern debates from the corpus through investigating the top 1,000 "burst" words found only two genuine debates, however. The low success rate was mainly caused by a few pathological feeds that dominated the results and obscured any significant debates. The results point to the need to develop effective data cleansing procedures for RSS feeds, particularly if there is not a large quantity of discussion about the broad issue, and a range of potential techniques is suggested. Finally, the analysis confirmed that the time series information generated by real-time monitoring of RSS feeds could usefully illustrate the evolution of new debates relevant to a broad issue.
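The burst idea in the abstract can be approximated by comparing a word's rate in the current window of feed items against its background rate. This is an illustrative sketch, not the authors' exact procedure; the thresholds are arbitrary:

```python
from collections import Counter

def burst_words(current_items, background_items, min_ratio=5.0, min_count=10):
    """Flag words whose rate in the current window far exceeds the background rate."""
    cur = Counter(w for item in current_items for w in item.lower().split())
    bg = Counter(w for item in background_items for w in item.lower().split())
    cur_total = sum(cur.values()) or 1
    bg_total = sum(bg.values()) or 1
    bursts = {}
    for word, count in cur.items():
        if count < min_count:
            continue  # too rare to be a burst; also screens some feed noise
        ratio = (count / cur_total) / ((bg.get(word, 0) + 1) / bg_total)  # +1 smoothing
        if ratio >= min_ratio:
            bursts[word] = ratio
    return sorted(bursts.items(), key=lambda kv: -kv[1])
```

As the abstract notes, pathological feeds can dominate such counts, so in practice a data-cleansing pass would precede this step.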
  5. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.01
    0.007479902 = product of:
      0.059839215 = sum of:
        0.059839215 = weight(_text_:case in 1401) [ClassicSimilarity], result of:
          0.059839215 = score(doc=1401,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34346986 = fieldWeight in 1401, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1401)
      0.125 = coord(1/8)
    
    Abstract
    Purpose - The purpose of this paper is to claim that it is useful to assess the web connectivity of research groups, describe hyperlink-based techniques to achieve this and present brief details of European life sciences research groups as a case study. Design/methodology/approach - A commercial search engine was harnessed to deliver hyperlink data via its automatic query submission interface. A special purpose link analysis tool, LexiURL, then summarised and graphed the link data in appropriate ways. Findings - Webometrics can provide a wide range of descriptive information about the international connectivity of research groups. Research limitations/implications - Only one field was analysed, data was taken from only one search engine, and the results were not validated. Practical implications - Web connectivity seems to be particularly important for attracting overseas job applicants and to promote research achievements and capabilities, and hence we contend that it can be useful for national and international governments to use webometrics to ensure that the web is being used effectively by research groups. Originality/value - This is the first paper to make a case for the value of using a range of webometric techniques to evaluate the web presences of research groups within a field, and possibly the first "applied" webometrics study produced for an external contract.
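One simple way to summarise the international connectivity the abstract describes is to classify a research group's inlink sources by country-code top-level domain. A hedged sketch (LexiURL's actual reports are richer; the URLs and home TLD are hypothetical):

```python
from collections import Counter
from urllib.parse import urlparse

def country_mix(inlink_urls, home_tld="ch"):
    """Split a group's inlinks into home-country vs. foreign by top-level domain."""
    tlds = Counter((urlparse(u).hostname or "").rsplit(".", 1)[-1]
                   for u in inlink_urls)
    foreign = sum(c for tld, c in tlds.items() if tld != home_tld)
    return tlds, foreign / (sum(tlds.values()) or 1)

urls = ["http://www.example.de/links.html",
        "http://bio.example.ch/partners.html",
        "http://www.example.ac.uk/collab.html"]
tlds, foreign_share = country_mix(urls, home_tld="ch")
print(tlds.most_common(), round(foreign_share, 2))  # e.g. 0.67 foreign
```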
  6. Thelwall, M.; Stuart, D.: Web crawling ethics revisited : cost, privacy, and denial of service (2006) 0.01
    0.0074047255 = product of:
      0.059237804 = sum of:
        0.059237804 = weight(_text_:case in 6098) [ClassicSimilarity], result of:
          0.059237804 = score(doc=6098,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34001783 = fieldWeight in 6098, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6098)
      0.125 = coord(1/8)
    
    Abstract
    Ethical aspects of the employment of Web crawlers for information science research and other contexts are reviewed. The difference between legal and ethical uses of communications technologies is emphasized as well as the changing boundary between ethical and unethical conduct. A review of the potential impacts on Web site owners is used to underpin a new framework for ethical crawling, and it is argued that delicate human judgment is required for each individual case, with verdicts likely to change over time. Decisions can be based upon an approximate cost-benefit analysis, but it is crucial that crawler owners find out about the technological issues affecting the owners of the sites being crawled in order to produce an informed assessment.
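robots.txt is not named in the abstract, but it is the standard mechanism a crawler owner would consult when making the kind of informed assessment the authors call for. A minimal check with Python's standard library:

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def may_fetch(url, user_agent="MyResearchCrawler"):
    """Honour the site owner's robots.txt before fetching (one courtesy signal among several)."""
    root = urlparse(url)
    rp = RobotFileParser(f"{root.scheme}://{root.netloc}/robots.txt")
    rp.read()  # network call; in real use, handle timeouts and missing files
    return rp.can_fetch(user_agent, url)
```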
  7. Vaughan, L.; Thelwall, M.: A modelling approach to uncover hyperlink patterns : the case of Canadian universities (2005) 0.01
    0.0074047255 = product of:
      0.059237804 = sum of:
        0.059237804 = weight(_text_:case in 1014) [ClassicSimilarity], result of:
          0.059237804 = score(doc=1014,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34001783 = fieldWeight in 1014, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1014)
      0.125 = coord(1/8)
    
  8. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word family members (2006) 0.01
    0.0063469075 = product of:
      0.05077526 = sum of:
        0.05077526 = weight(_text_:case in 5896) [ClassicSimilarity], result of:
          0.05077526 = score(doc=5896,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.29144385 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
      0.125 = coord(1/8)
    
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
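The detection step can be sketched as scanning text for unknown words that embed a common letter sequence. The stem "blog" and the small dictionary below are hypothetical examples, not the paper's data:

```python
import re

def hybrid_candidates(text, stem, known_words):
    """Find unknown words that embed a common letter sequence (the stem)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(w for w in words
                  if stem in w and w != stem and w not in known_words)

known = {"blog", "blogs", "weblog"}  # hypothetical dictionary of known forms
text = "Bloggers and the blogosphere: why blogrolls spread ideas between blogs."
print(hybrid_candidates(text, "blog", known))
# ['bloggers', 'blogosphere', 'blogrolls'] - candidates for manual checking
```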
  9. Thelwall, M.; Vann, K.; Fairclough, R.: Web issue analysis : an integrated water resource management case study (2006) 0.01
    0.0063469075 = product of:
      0.05077526 = sum of:
        0.05077526 = weight(_text_:case in 5906) [ClassicSimilarity], result of:
          0.05077526 = score(doc=5906,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.29144385 = fieldWeight in 5906, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=5906)
      0.125 = coord(1/8)
    
  10. Thelwall, M.: Extracting accurate and complete results from search engines : case study Windows Live (2008) 0.01
    0.0063469075 = product of:
      0.05077526 = sum of:
        0.05077526 = weight(_text_:case in 1338) [ClassicSimilarity], result of:
          0.05077526 = score(doc=1338,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.29144385 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=1338)
      0.125 = coord(1/8)
    
  11. Shifman, L.; Thelwall, M.: Assessing global diffusion with Web memetics : the spread and evolution of a popular joke (2009) 0.01
    0.0063469075 = product of:
      0.05077526 = sum of:
        0.05077526 = weight(_text_:case in 3303) [ClassicSimilarity], result of:
          0.05077526 = score(doc=3303,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.29144385 = fieldWeight in 3303, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=3303)
      0.125 = coord(1/8)
    
    Abstract
    Memes are small units of culture, analogous to genes, which flow from person to person by copying or imitation. More than any previous medium, the Internet has the technical capabilities for global meme diffusion. Yet, to spread globally, memes need to negotiate their way through cultural and linguistic borders. This article introduces a new broad method, Web memetics, comprising extensive Web searches and combined quantitative and qualitative analyses, to identify and assess: (a) the different versions of a meme, (b) its evolution online, and (c) its Web presence and translation into common Internet languages. This method is demonstrated through one extensively circulated joke about men, women, and computers. The results show that the joke has mutated into several different versions and is widely translated, and that translations incorporate small, local adaptations while retaining the English versions' fundamental components. In conclusion, Web memetics has demonstrated its ability to identify and track the evolution and spread of memes online, with interesting results, albeit for only one case study.
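Grouping retrieved texts into versions of a meme can be approximated with fuzzy string matching. The authors combined quantitative and qualitative analyses, so this difflib sketch only illustrates the clustering intuition (the jokes are invented):

```python
from difflib import SequenceMatcher

def group_versions(texts, threshold=0.75):
    """Greedily cluster near-duplicate texts as versions of one meme."""
    groups = []
    for t in texts:
        for g in groups:
            if SequenceMatcher(None, t.lower(), g[0].lower()).ratio() >= threshold:
                g.append(t)
                break
        else:
            groups.append([t])  # no close match: start a new version group
    return groups

jokes = ["A man and a woman walk into a computer shop...",
         "A man and a woman walked into a computer shop...",
         "Completely different joke about cats."]
print(len(group_versions(jokes)))  # 2: one meme in two versions, plus an outlier
```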
  12. Payne, N.; Thelwall, M.: Mathematical models for academic webs : linear relationship or non-linear power law? (2005) 0.01
    0.006099823 = product of:
      0.048798583 = sum of:
        0.048798583 = weight(_text_:studies in 1066) [ClassicSimilarity], result of:
          0.048798583 = score(doc=1066,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.30860704 = fieldWeight in 1066, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1066)
      0.125 = coord(1/8)
    
    Abstract
    Previous studies of academic web interlinking have tended to hypothesise that the relationship between the research of a university and links to or from its web site should follow a linear trend, yet the typical distribution of web data, in general, seems to be a non-linear power law. This paper assesses whether a linear trend or a power law is the most appropriate method with which to model the relationship between research and web site size or outlinks. Following linear regression, analysis of the confidence intervals for the logarithmic graphs, and analysis of the outliers, the results suggest that a linear trend is more appropriate than a non-linear power law.
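The comparison amounts to fitting y = a*x + b on the raw data against log y = k*log x + c (i.e. a power law y = C*x^k) and asking which fits better. A numpy sketch with hypothetical data:

```python
import numpy as np

research = np.array([1.0, 2.0, 3.5, 5.0, 8.0])            # hypothetical research scores
outlinks = np.array([120.0, 260.0, 400.0, 610.0, 980.0])  # hypothetical outlink counts

def r_squared(x, y):
    """R^2 of an ordinary least-squares line fitted to (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - residuals.var() / y.var()

print("linear    R^2:", round(r_squared(research, outlinks), 3))
print("power-law R^2:", round(r_squared(np.log(research), np.log(outlinks)), 3))
# As the paper does, confidence intervals and outliers should also be inspected.
```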
  13. Thelwall, M.; Harries, G.: Do the Web Sites of Higher Rated Scholars Have Significantly More Online Impact? (2004) 0.01
    0.0052890894 = product of:
      0.042312715 = sum of:
        0.042312715 = weight(_text_:case in 2123) [ClassicSimilarity], result of:
          0.042312715 = score(doc=2123,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 2123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
      0.125 = coord(1/8)
    
    Abstract
    The quality and impact of academic Web sites is of interest to many audiences, including the scholars who use them and Web educators who need to identify best practice. Several large-scale European Union research projects have been funded to build new indicators for online scientific activity, reflecting recognition of the importance of the Web for scholarly communication. In this paper we address the key question of whether higher rated scholars produce higher impact Web sites, using the United Kingdom as a case study and measuring scholars' quality in terms of university-wide average research ratings. Methodological issues concerning the measurement of the online impact are discussed, leading to the adoption of counts of links to a university's constituent single domain Web sites from an aggregated counting metric. The findings suggest that universities with higher rated scholars produce significantly more Web content but with a similar average online impact. Higher rated scholars therefore attract more total links from their peers, but only by being more prolific, refuting earlier suggestions. It can be surmised that general Web publications are very different from scholarly journal articles and conference papers, for which scholarly quality does associate with citation impact. This has important implications for the construction of new Web indicators, for example that online impact should not be used to assess the quality of small groups of scholars, even within a single discipline.
  14. Thelwall, M.: ¬A layered approach for investigating the topological structure of communities in the Web (2003) 0.01
    0.0052890894 = product of:
      0.042312715 = sum of:
        0.042312715 = weight(_text_:case in 4450) [ClassicSimilarity], result of:
          0.042312715 = score(doc=4450,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 4450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4450)
      0.125 = coord(1/8)
    
    Abstract
A layered approach for identifying communities in the Web is presented and explored by applying the Flake exact community identification algorithm to the UK academic Web. Although community or topic identification is a common task in information retrieval, a new perspective is developed by: the application of alternative document models, shifting the focus from individual pages to aggregated collections based upon Web directories, domains and entire sites; the removal of internal site links; and the adaptation of a new fast algorithm to allow fully-automated community identification using all possible single starting points. The overall topology of the graphs in the three least-aggregated layers was first investigated and found to include a large number of isolated points but, surprisingly, with most of the remainder being in one huge connected component, exact proportions varying by layer. The community identification process then found that the number of communities far exceeded the number of topological components, indicating that community identification is a potentially useful technique, even with random starting points. Both the number and size of communities identified were dependent on the parameter of the algorithm, with very different results being obtained in each case. In conclusion, the UK academic Web is embedded with layers of non-trivial communities and, if it is not unique in this, then there is the promise of improved results for information retrieval algorithms that can exploit this additional structure, and the application of the technique directly to partially automate Web metrics tasks such as that of finding all pages related to a given subject hosted by a single country's universities.
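The layered models collapse pages into directories, domains, or whole sites before graph analysis, discarding internal links. A sketch of the aggregation step plus a connected-component count (the heuristics, especially the "site" rule, are crude stand-ins):

```python
from urllib.parse import urlparse

def aggregate(url, layer):
    """Map a page URL to its document unit: 'page', 'directory', 'domain' or 'site'."""
    p = urlparse(url)
    host = p.hostname or ""
    if layer == "directory":
        return host + p.path.rsplit("/", 1)[0]
    if layer == "domain":
        return host
    if layer == "site":
        return ".".join(host.split(".")[-3:])  # crude: keeps e.g. wlv.ac.uk
    return host + p.path

def components(edges):
    """Count connected components, ignoring self-links (former internal links)."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edges:
        ra, rb = find(a), find(b)
        if a != b and ra != rb:
            parent[ra] = rb
    return len({find(n) for n in list(parent)})

links = [("http://a.wlv.ac.uk/x/p1.html", "http://a.wlv.ac.uk/x/p2.html"),
         ("http://a.wlv.ac.uk/x/p2.html", "http://www.ox.ac.uk/y/p3.html")]
edges = [(aggregate(s, "domain"), aggregate(t, "domain")) for s, t in links]
print(components(edges))  # 1: the intra-site link collapses to an ignored self-link
```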
  15. Kousha, K.; Thelwall, M.: Google book search : citation analysis for social science and the humanities (2009) 0.01
    0.0052890894 = product of:
      0.042312715 = sum of:
        0.042312715 = weight(_text_:case in 2946) [ClassicSimilarity], result of:
          0.042312715 = score(doc=2946,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 2946, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2946)
      0.125 = coord(1/8)
    
    Abstract
    In both the social sciences and the humanities, books and monographs play significant roles in research communication. The absence of citations from most books and monographs from the Thomson Reuters/Institute for Scientific Information databases (ISI) has been criticized, but attempts to include citations from or to books in the research evaluation of the social sciences and humanities have not led to widespread adoption. This article assesses whether Google Book Search (GBS) can partially fill this gap by comparing citations from books with citations from journal articles to journal articles in 10 science, social science, and humanities disciplines. Book citations were 31% to 212% of ISI citations and, hence, numerous enough to supplement ISI citations in the social sciences and humanities covered, but not in the sciences (3%-5%), except for computing (46%), due to numerous published conference proceedings. A case study was also made of all 1,923 articles in the 51 information science and library science ISI-indexed journals published in 2003. Within this set, highly book-cited articles tended to receive many ISI citations, indicating a significant relationship between the two types of citation data, but with important exceptions that point to the additional information provided by book citations. In summary, GBS is clearly a valuable new source of citation data for the social sciences and humanities. One practical implication is that book-oriented scholars should consult it for additional citations to their work when applying for promotion and tenure.
  16. Thelwall, M.: Interpreting social science link analysis research : a theoretical framework (2006) 0.01
    0.00522842 = product of:
      0.04182736 = sum of:
        0.04182736 = weight(_text_:studies in 4908) [ClassicSimilarity], result of:
          0.04182736 = score(doc=4908,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.26452032 = fieldWeight in 4908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.046875 = fieldNorm(doc=4908)
      0.125 = coord(1/8)
    
    Abstract
Link analysis in various forms is now an established technique in many different subjects, reflecting the perceived importance of links and of the Web. A critical but very difficult issue is how to interpret the results of social science link analyses. It is argued that the dynamic nature of the Web, its lack of quality control, and the online proliferation of copying and imitation mean that methodologies operating within a highly positivist, quantitative framework are ineffective. Conversely, the sheer variety of the Web makes application of qualitative methodologies and pure reason very problematic to large-scale studies. Methodology triangulation is consequently advocated, in combination with a warning that the Web is incapable of giving definitive answers to large-scale link analysis research questions concerning social factors underlying link creation. Finally, it is claimed that although theoretical frameworks are appropriate for guiding research, a Theory of Link Analysis is not possible.
  17. Thelwall, M.: Conceptualizing documentation on the Web : an evaluation of different heuristic-based models for counting links between university Web sites (2002) 0.00
    0.0043570166 = product of:
      0.034856133 = sum of:
        0.034856133 = weight(_text_:studies in 978) [ClassicSimilarity], result of:
          0.034856133 = score(doc=978,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.22043361 = fieldWeight in 978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=978)
      0.125 = coord(1/8)
    
    Abstract
All known previous Web link studies have used the Web page as the primary indivisible source document for counting purposes. Arguments are presented to explain why this is not necessarily optimal and why other alternatives have the potential to produce better results. This is despite the fact that individual Web files are often the only choice if search engines are used for raw data and are the easiest basic Web unit to identify. The central issue is of defining the Web "document": that which should comprise the single indissoluble unit of coherent material. Three alternative heuristics are defined for the educational arena based upon the directory, the domain and the whole university site. These are then compared by implementing them on a set of 108 UK university institutional Web sites under the assumption that a more effective heuristic will tend to produce results that correlate more highly with institutional research productivity. It was discovered that the domain and directory models were able to successfully reduce the impact of anomalous linking behavior between pairs of Web sites, with the latter being the method of choice. Reasons are then given as to why a document model on its own cannot eliminate all anomalies in Web linking behavior. Finally, the results from all models give a clear confirmation of the very strong association between the research productivity of a UK university and the number of incoming links from its peers' Web sites.
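The evaluation criterion above is that a better document model yields link counts that correlate more highly with research productivity. A sketch of that comparison with hypothetical counts and a hand-rolled Pearson correlation:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-university inlink counts under each document model,
# alongside a research productivity measure.
productivity = [4.2, 3.1, 5.0, 2.2, 3.8]
counts = {
    "page":      [900, 700, 1500, 600, 560],
    "directory": [310, 200, 420, 120, 260],
    "domain":    [150, 90, 210, 60, 130],
}
for model, c in counts.items():
    print(model, round(pearson(c, productivity), 3))  # higher = better heuristic
```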
  18. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.00
    0.0043570166 = product of:
      0.034856133 = sum of:
        0.034856133 = weight(_text_:studies in 4279) [ClassicSimilarity], result of:
          0.034856133 = score(doc=4279,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.22043361 = fieldWeight in 4279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
      0.125 = coord(1/8)
    
    Abstract
Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
  19. Barjak, F.; Thelwall, M.: A statistical analysis of the web presences of European life sciences research teams (2008) 0.00
    0.0043570166 = product of:
      0.034856133 = sum of:
        0.034856133 = weight(_text_:studies in 1383) [ClassicSimilarity], result of:
          0.034856133 = score(doc=1383,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.22043361 = fieldWeight in 1383, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1383)
      0.125 = coord(1/8)
    
    Abstract
    Web links have been used for around ten years to explore the online impact of academic information and information producers. Nevertheless, few studies have attempted to relate link counts to relevant offline attributes of the owners of the targeted Web sites, with the exception of research productivity. This article reports the results of a study to relate site inlink counts to relevant owner characteristics for over 400 European life-science research group Web sites. The analysis confirmed that research-group size and Web-presence size were important for attracting Web links, although research productivity was not. Little evidence was found for significant influence of any of an array of factors, including research-group leader gender and industry connections. In addition, the choice of search engine for link data created a surprising international difference in the results, with Google perhaps giving unreliable results. Overall, the data collection, statistical analysis and results interpretation were all complex and it seems that we still need to know more about search engines, hyperlinks, and their function in science before we can draw conclusions on their usefulness and role in the canon of science and technology indicators.
  20. Thelwall, M.: Bibliometrics to webometrics (2009) 0.00
    0.0041342513 = product of:
      0.03307401 = sum of:
        0.03307401 = weight(_text_:libraries in 4239) [ClassicSimilarity], result of:
          0.03307401 = score(doc=4239,freq=2.0), product of:
            0.13017908 = queryWeight, product of:
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.03962768 = queryNorm
            0.25406548 = fieldWeight in 4239, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4239)
      0.125 = coord(1/8)
    
    Abstract
Bibliometrics has changed out of all recognition since 1958, becoming established as a field, being taught widely in library and information science schools, and being at the core of a number of science evaluation research groups around the world. This was all made possible by the work of Eugene Garfield and his Science Citation Index. This article reviews the distance that bibliometrics has travelled since 1958 by comparing early bibliometrics with current practice, and by giving an overview of a range of recent developments, such as patent analysis, national research evaluation exercises, visualization techniques, new applications, online citation indexes, and the creation of digital libraries. Webometrics, a modern, fast-growing offshoot of bibliometrics, is reviewed in detail. Finally, future prospects are discussed with regard to both bibliometrics and webometrics.