Search (77 results, page 1 of 4)

  • author_ss:"Thelwall, M."
  1. Thelwall, M.; Vaughan, L.: Webometrics : an introduction to the special issue (2004) 0.09
    0.085657865 = product of:
      0.17131573 = sum of:
        0.07707134 = weight(_text_:wide in 2908) [ClassicSimilarity], result of:
          0.07707134 = score(doc=2908,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.3916274 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0625 = fieldNorm(doc=2908)
        0.041812565 = weight(_text_:web in 2908) [ClassicSimilarity], result of:
          0.041812565 = score(doc=2908,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.2884563 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=2908)
        0.05243182 = weight(_text_:computer in 2908) [ClassicSimilarity], result of:
          0.05243182 = score(doc=2908,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.32301605 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0625 = fieldNorm(doc=2908)
      0.5 = coord(3/6)
    
    Abstract
    Webometrics, the quantitative study of Web phenomena, is a field encompassing contributions from information science, computer science, and statistical physics. Its methodology draws especially from bibliometrics. This special issue presents contributions that both push forward the field and illustrate a wide range of webometric approaches.
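    The score tree for result 1 (doc 2908) can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values printed in the listing. The function names `idf` and `term_score` are hypothetical; all numeric inputs are copied from the listing.

```python
import math

QUERY_NORM = 0.044416238   # queryNorm, copied from the listing
MAX_DOCS = 44218           # maxDocs, copied from the listing


def idf(doc_freq: int) -> float:
    """Inverse document frequency (assumed ClassicSimilarity form)."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                   # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# (termFreq, docFreq) for "wide", "web", "computer" in doc 2908
terms_2908 = [(2.0, 1430), (2.0, 4597), (2.0, 3109)]
summed = sum(term_score(freq, df, 0.0625) for freq, df in terms_2908)

# coord(3/6): three of the six query clauses matched
total_2908 = summed * (3 / 6)
print(f"{total_2908:.4f}")
```

    Under these assumptions the result agrees with the listed 0.085657865 to within rounding; small differences in the last digits are expected because the engine works in single precision.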
  2. Thelwall, M.: Webometrics (2009) 0.05
    0.04692425 = product of:
      0.14077275 = sum of:
        0.057803504 = weight(_text_:wide in 3906) [ClassicSimilarity], result of:
          0.057803504 = score(doc=3906,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 3906, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=3906)
        0.08296924 = weight(_text_:web in 3906) [ClassicSimilarity], result of:
          0.08296924 = score(doc=3906,freq=14.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.57238775 = fieldWeight in 3906, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3906)
      0.33333334 = coord(2/6)
    
    Abstract
    Webometrics is an information science field concerned with measuring aspects of the World Wide Web (WWW) for a variety of information science research goals. It came into existence about five years after the Web was formed and has since grown to become a significant aspect of information science, at least in terms of published research. Although some webometrics research has focused on the structure or evolution of the Web itself or the performance of commercial search engines, most has used data from the Web to shed light on information provision or online communication in various contexts. Most prominently, techniques have been developed to track, map, and assess Web-based informal scholarly communication, for example, in terms of the hyperlinks between academic Web sites or the online impact of digital repositories. In addition, a range of nonacademic issues and groups of Web users have also been analyzed.
  3. Thelwall, M.; Harries, G.: Do the Web Sites of Higher Rated Scholars Have Significantly More Online Impact? (2004) 0.04
    0.042189382 = product of:
      0.12656814 = sum of:
        0.04816959 = weight(_text_:wide in 2123) [ClassicSimilarity], result of:
          0.04816959 = score(doc=2123,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 2123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
        0.078398556 = weight(_text_:web in 2123) [ClassicSimilarity], result of:
          0.078398556 = score(doc=2123,freq=18.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5408555 = fieldWeight in 2123, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
      0.33333334 = coord(2/6)
    
    Abstract
    The quality and impact of academic Web sites is of interest to many audiences, including the scholars who use them and Web educators who need to identify best practice. Several large-scale European Union research projects have been funded to build new indicators for online scientific activity, reflecting recognition of the importance of the Web for scholarly communication. In this paper we address the key question of whether higher rated scholars produce higher impact Web sites, using the United Kingdom as a case study and measuring scholars' quality in terms of university-wide average research ratings. Methodological issues concerning the measurement of the online impact are discussed, leading to the adoption of counts of links to a university's constituent single domain Web sites from an aggregated counting metric. The findings suggest that universities with higher rated scholars produce significantly more Web content but with a similar average online impact. Higher rated scholars therefore attract more total links from their peers, but only by being more prolific, refuting earlier suggestions. It can be surmised that general Web publications are very different from scholarly journal articles and conference papers, for which scholarly quality does associate with citation impact. This has important implications for the construction of new Web indicators, for example that online impact should not be used to assess the quality of small groups of scholars, even within a single discipline.
  4. Kousha, K.; Thelwall, M.: Google Scholar citations and Google Web/URL citations : a multi-discipline exploratory analysis (2007) 0.04
    0.038469743 = product of:
      0.115409225 = sum of:
        0.08263934 = weight(_text_:web in 337) [ClassicSimilarity], result of:
          0.08263934 = score(doc=337,freq=20.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5701118 = fieldWeight in 337, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=337)
        0.03276989 = weight(_text_:computer in 337) [ClassicSimilarity], result of:
          0.03276989 = score(doc=337,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.20188503 = fieldWeight in 337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=337)
      0.33333334 = coord(2/6)
    
    Abstract
    We use a new data gathering method, "Web/URL citation," Web/URL and Google Scholar to compare traditional and Web-based citation patterns across multiple disciplines (biology, chemistry, physics, computing, sociology, economics, psychology, and education) based upon a sample of 1,650 articles from 108 open access (OA) journals published in 2001. A Web/URL citation of an online journal article is a Web mention of its title, URL, or both. For each discipline, except psychology, we found significant correlations between Thomson Scientific (formerly Thomson ISI, here: ISI) citations and both Google Scholar and Google Web/URL citations. Google Scholar citations correlated more highly with ISI citations than did Google Web/URL citations, indicating that the Web/URL method measures a broader type of citation phenomenon. Google Scholar citations were more numerous than ISI citations in computer science and the four social science disciplines, suggesting that Google Scholar is more comprehensive for social sciences and perhaps also when conference articles are valued and published online. We also found large disciplinary differences in the percentage overlap between ISI and Google Scholar citation sources. Finally, although we found many significant trends, there were also numerous exceptions, suggesting that replacing traditional citation sources with the Web or Google Scholar for research impact calculations would be problematic.
  5. Thelwall, M.; Buckley, K.; Paltoglou, G.: Sentiment strength detection for the social web (2012) 0.04
    0.037393916 = product of:
      0.112181745 = sum of:
        0.04816959 = weight(_text_:wide in 4972) [ClassicSimilarity], result of:
          0.04816959 = score(doc=4972,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 4972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
        0.064012155 = weight(_text_:web in 4972) [ClassicSimilarity], result of:
          0.064012155 = score(doc=4972,freq=12.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.4416067 = fieldWeight in 4972, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
      0.33333334 = coord(2/6)
    
    Abstract
    Sentiment analysis is concerned with the automatic extraction of sentiment-related information from text. Although most sentiment analysis addresses commercial tasks, such as extracting opinions from product reviews, there is increasing interest in the affective dimension of the social web, and Twitter in particular. Most sentiment analysis algorithms are not ideally suited to this task because they exploit indirect indicators of sentiment that can reflect genre or topic instead. Hence, such algorithms used to process social web texts can identify spurious sentiment patterns caused by topics rather than affective phenomena. This article assesses an improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment. The results from six diverse social web data sets (MySpace, Twitter, YouTube, Digg, Runners World, BBC Forums) indicate that SentiStrength 2 is successful in the sense of performing better than a baseline approach for all data sets in both supervised and unsupervised cases. SentiStrength is not always better than machine-learning approaches that exploit indirect indicators of sentiment, however, and is particularly weaker for positive sentiment in news-related discussions. Overall, the results suggest that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.
  6. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.04
    0.03737321 = product of:
      0.11211963 = sum of:
        0.057803504 = weight(_text_:wide in 5896) [ClassicSimilarity], result of:
          0.057803504 = score(doc=5896,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
        0.054316122 = weight(_text_:web in 5896) [ClassicSimilarity], result of:
          0.054316122 = score(doc=5896,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 5896, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
      0.33333334 = coord(2/6)
    
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
  7. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.04
    0.037056148 = product of:
      0.111168444 = sum of:
        0.078398556 = weight(_text_:web in 4279) [ClassicSimilarity], result of:
          0.078398556 = score(doc=4279,freq=18.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5408555 = fieldWeight in 4279, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
        0.03276989 = weight(_text_:computer in 4279) [ClassicSimilarity], result of:
          0.03276989 = score(doc=4279,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.20188503 = fieldWeight in 4279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
      0.33333334 = coord(2/6)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
  8. Thelwall, M.: Text characteristics of English language university Web sites (2005) 0.04
    0.036481895 = product of:
      0.109445676 = sum of:
        0.07012181 = weight(_text_:web in 3463) [ClassicSimilarity], result of:
          0.07012181 = score(doc=3463,freq=10.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.48375595 = fieldWeight in 3463, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3463)
        0.039323866 = weight(_text_:computer in 3463) [ClassicSimilarity], result of:
          0.039323866 = score(doc=3463,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 3463, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=3463)
      0.33333334 = coord(2/6)
    
    Abstract
    The nature of the contents of academic Web sites is of direct relevance to the new field of scientific Web intelligence, and for search engine and topic-specific crawler designers. We analyze word frequencies in national academic Webs using the Web sites of three English-speaking nations: Australia, New Zealand, and the United Kingdom. Strong regularities were found in page size and word frequency distributions, but with significant anomalies. At least 26% of pages contain no words. High frequency words include university names and acronyms, Internet terminology, and computing product names: not always words in common usage away from the Web. A minority of low frequency words are spelling mistakes, with other common types including nonwords, proper names, foreign language terms or computer science variable names. Based upon these findings, recommendations for data cleansing and filtering are made, particularly for clustering applications.
  9. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.04
    0.03553481 = product of:
      0.10660443 = sum of:
        0.04816959 = weight(_text_:wide in 1401) [ClassicSimilarity], result of:
          0.04816959 = score(doc=1401,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 1401, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1401)
        0.05843484 = weight(_text_:web in 1401) [ClassicSimilarity], result of:
          0.05843484 = score(doc=1401,freq=10.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.40312994 = fieldWeight in 1401, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1401)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose - The purpose of this paper is to claim that it is useful to assess the web connectivity of research groups, describe hyperlink-based techniques to achieve this and present brief details of European life sciences research groups as a case study. Design/methodology/approach - A commercial search engine was harnessed to deliver hyperlink data via its automatic query submission interface. A special purpose link analysis tool, LexiURL, then summarised and graphed the link data in appropriate ways. Findings - Webometrics can provide a wide range of descriptive information about the international connectivity of research groups. Research limitations/implications - Only one field was analysed, data was taken from only one search engine, and the results were not validated. Practical implications - Web connectivity seems to be particularly important for attracting overseas job applicants and to promote research achievements and capabilities, and hence we contend that it can be useful for national and international governments to use webometrics to ensure that the web is being used effectively by research groups. Originality/value - This is the first paper to make a case for the value of using a range of webometric techniques to evaluate the web presences of research groups within a field, and possibly the first "applied" webometrics study produced for an external contract.
  10. Zuccala, A.; Thelwall, M.; Oppenheim, C.; Dhiensa, R.: Web intelligence analyses of digital libraries : a case study of the National electronic Library for Health (NeLH) (2007) 0.03
    0.032879137 = product of:
      0.0986374 = sum of:
        0.07242149 = weight(_text_:web in 838) [ClassicSimilarity], result of:
          0.07242149 = score(doc=838,freq=24.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.49962097 = fieldWeight in 838, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=838)
        0.02621591 = weight(_text_:computer in 838) [ClassicSimilarity], result of:
          0.02621591 = score(doc=838,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.16150802 = fieldWeight in 838, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.03125 = fieldNorm(doc=838)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose - The purpose of this paper is to explore the use of LexiURL as a Web intelligence tool for collecting and analysing links to digital libraries, focusing specifically on the National electronic Library for Health (NeLH). Design/methodology/approach - The Web intelligence techniques in this study are a combination of link analysis (web structure mining), web server log file analysis (web usage mining), and text analysis (web content mining), utilizing the power of commercial search engines and drawing upon the information science fields of bibliometrics and webometrics. LexiURL is a computer program designed to calculate summary statistics for lists of links or URLs. Its output is a series of standard reports, for example listing and counting all of the different domain names in the data. Findings - Link data, when analysed together with user transaction log files (i.e. Web referring domains) can provide insights into who is using a digital library and when, and who could be using the digital library if they are "surfing" a particular part of the Web; in this case any site that is linked to or colinked with the NeLH. This study found that the NeLH was embedded in a multifaceted Web context, including many governmental, educational, commercial and organisational sites, with the most interesting being sites from the .edu domain, representing American Universities. Not many links directed to the NeLH were followed on September 25, 2005 (the date of the log file analysis and link extraction analysis), which means that users who access the digital library have been arriving at the site via only a few select links, bookmarks and search engine searches, or non-electronic sources. Originality/value - A number of studies concerning digital library users have been carried out using log file analysis as a research tool. Log files focus on real-time user transactions; while LexiURL can be used to extract links and colinks associated with a digital library's growing Web network. This Web network is not recognized often enough, and can be a useful indication of where potential users are surfing, even if they have not yet specifically visited the NeLH site.
  11. Kousha, K.; Thelwall, M.: How is science cited on the Web? : a classification of google unique Web citations (2007) 0.03
    0.03256127 = product of:
      0.0976838 = sum of:
        0.08263934 = weight(_text_:web in 586) [ClassicSimilarity], result of:
          0.08263934 = score(doc=586,freq=20.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5701118 = fieldWeight in 586, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=586)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 586) [ClassicSimilarity], result of:
              0.030088935 = score(doc=586,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 586, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=586)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Although the analysis of citations in the scholarly literature is now an established and relatively well understood part of information science, not enough is known about citations that can be found on the Web. In particular, are there new Web types, and if so, are these trivial or potentially useful for studying or evaluating research communication? We sought evidence based upon a sample of 1,577 Web citations of the URLs or titles of research articles in 64 open-access journals from biology, physics, chemistry, and computing. Only 25% represented intellectual impact, from references of Web documents (23%) and other informal scholarly sources (2%). Many of the Web/URL citations were created for general or subject-specific navigation (45%) or for self-publicity (22%). Additional analyses revealed significant disciplinary differences in the types of Google unique Web/URL citations as well as some characteristics of scientific open-access publishing on the Web. We conclude that the Web provides access to a new and different type of citation information, one that may therefore enable us to measure different aspects of research, and the research process in particular; but to obtain good information, the different types should be separated.
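    Result 11 (doc 586) shows a nested clause: the term "22" sits inside its own sub-query, which is scaled by coord(1/2) before it joins the outer sum, and the outer sum is then scaled by coord(2/6). A minimal sketch of that arithmetic follows, again assuming the ClassicSimilarity forms tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); `term_score` is a hypothetical helper, and the inputs are copied from the listing.

```python
import math

QUERY_NORM = 0.044416238   # queryNorm, copied from the listing
MAX_DOCS = 44218           # maxDocs, copied from the listing


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """queryWeight * fieldWeight for one term (assumed ClassicSimilarity)."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)


FIELD_NORM_586 = 0.0390625

# Top-level clause: "web", termFreq=20, docFreq=4597
web = term_score(20.0, 4597, FIELD_NORM_586)

# Nested clause: "22", termFreq=2, docFreq=3622, scaled by coord(1/2)
nested = term_score(2.0, 3622, FIELD_NORM_586) * (1 / 2)

# Outer sum scaled by coord(2/6): two of six top-level clauses matched
total_586 = (web + nested) * (2 / 6)
print(f"{total_586:.4f}")
```

    The nested clause counts as a single top-level clause, which is why the outer factor is 2/6 rather than 3/6.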
  12. Thelwall, M.; Klitkou, A.; Verbeek, A.; Stuart, D.; Vincent, C.: Policy-relevant Webometrics for individual scientific fields (2010) 0.03
    0.029720977 = product of:
      0.08916293 = sum of:
        0.057803504 = weight(_text_:wide in 3574) [ClassicSimilarity], result of:
          0.057803504 = score(doc=3574,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 3574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=3574)
        0.031359423 = weight(_text_:web in 3574) [ClassicSimilarity], result of:
          0.031359423 = score(doc=3574,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.21634221 = fieldWeight in 3574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3574)
      0.33333334 = coord(2/6)
    
    Abstract
    Despite over 10 years of research there is no agreement on the most suitable roles for Webometric indicators in support of research policy and almost no field-based Webometrics. This article partly fills these gaps by analyzing the potential of policy-relevant Webometrics for individual scientific fields with the help of 4 case studies. Although Webometrics cannot provide robust indicators of knowledge flows or research impact, it can provide some evidence of networking and mutual awareness. The scope of Webometrics is also relatively wide, including not only research organizations and firms but also intermediary groups like professional associations, Web portals, and government agencies. Webometrics can, therefore, provide evidence about the research process to complement peer review, bibliometric, and patent indicators: tracking the early, mainly prepublication development of new fields and research funding initiatives, assessing the role and impact of intermediary organizations and the need for new ones, and monitoring the extent of mutual awareness in particular research areas.
  13. Orduna-Malea, E.; Thelwall, M.; Kousha, K.: Web citations in patents : evidence of technological impact? (2017) 0.03
    0.027890932 = product of:
      0.08367279 = sum of:
        0.04434892 = weight(_text_:web in 3764) [ClassicSimilarity], result of:
          0.04434892 = score(doc=3764,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 3764, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3764)
        0.039323866 = weight(_text_:computer in 3764) [ClassicSimilarity], result of:
          0.039323866 = score(doc=3764,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 3764, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=3764)
      0.33333334 = coord(2/6)
    
    Abstract
    Patents sometimes cite webpages either as general background to the problem being addressed or to identify prior publications that limit the scope of the patent granted. Counts of the number of patents citing an organization's website may therefore provide an indicator of its technological capacity or relevance. This article introduces methods to extract URL citations from patents and evaluates the usefulness of counts of patent web citations as a technology indicator. An analysis of patents citing 200 US universities or 177 UK universities found computer science and engineering departments to be frequently cited, as well as research-related webpages, such as Wikipedia, YouTube, or the Internet Archive. Overall, however, patent URL citations seem to be frequent enough to be useful for ranking major US and the top few UK universities if popular hosted subdomains are filtered out, but the hit count estimates on the first search engine results page should not be relied upon for accuracy.
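
The score breakdown for entry 13 can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative, and queryNorm and fieldNorm are copied from the explain output rather than derived.

```python
import math

# Constants copied from the explain output above (not derived here).
QUERY_NORM = 0.044416238
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int) -> float:
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term clause: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Entry 13 (doc=3764): "web" occurs 4 times, "computer" twice.
web = term_score(freq=4.0, doc_freq=4597, field_norm=0.046875)
computer = term_score(freq=2.0, doc_freq=3109, field_norm=0.046875)

# coord(2/6): two of the six query clauses matched this document.
total = (web + computer) * (2 / 6)
print(f"web={web:.6f} computer={computer:.6f} total={total:.6f}")
```

Up to floating-point rounding (Lucene computes in 32-bit floats), the result should agree with the 0.027890932 reported for this entry.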
  14. Thelwall, M.; Buckley, K.; Paltoglou, G.; Cai, D.; Kappas, A.: Sentiment strength detection in short informal text (2010) 0.02
    0.021071352 = product of:
      0.063214056 = sum of:
        0.04816959 = weight(_text_:wide in 4200) [ClassicSimilarity], result of:
          0.04816959 = score(doc=4200,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 4200, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4200)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 4200) [ClassicSimilarity], result of:
              0.030088935 = score(doc=4200,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 4200, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4200)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.
    Date
    22. 1.2011 14:29:23
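
Entry 14 shows a nested clause: the weight for the term "22" is first scaled by an inner coord(1/2) and only then added to the "wide" weight, after which the outer coord(2/6) applies. A small sketch of that arithmetic, using the term weights copied from the explain output (the helper name `boolean_score` is hypothetical):

```python
def boolean_score(clause_scores, matched: int, total: int) -> float:
    """Sum the matching clause scores, then scale by coord(matched/total)."""
    return sum(clause_scores) * (matched / total)


# Term weights copied from the explain tree for doc=4200.
wide = 0.04816959
term_22 = 0.030088935

# Inner clause: one of two sub-clauses matched, so coord(1/2).
inner = boolean_score([term_22], matched=1, total=2)

# Outer query: two of six top-level clauses matched, so coord(2/6).
outer = boolean_score([wide, inner], matched=2, total=6)
print(f"inner={inner:.7f} outer={outer:.7f}")
```

The same arithmetic applies to entries 15, 16 and 19, which share this nested structure.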
  15. Thelwall, M.; Kousha, K.; Abdoli, M.; Stuart, E.; Makita, M.; Wilson, P.; Levitt, J.: Why are coauthored academic articles more cited : higher quality or larger audience? (2023) 0.02
    0.021071352 = product of:
      0.063214056 = sum of:
        0.04816959 = weight(_text_:wide in 995) [ClassicSimilarity], result of:
          0.04816959 = score(doc=995,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 995, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=995)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 995) [ClassicSimilarity], result of:
              0.030088935 = score(doc=995,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 995, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=995)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Collaboration is encouraged because it is believed to improve academic research, supported by indirect evidence in the form of more coauthored articles being more cited. Nevertheless, this might not reflect quality but increased self-citations or the "audience effect": citations from increased awareness through multiple author networks. We address this with the first science-wide investigation into whether author numbers associate with journal article quality, using expert peer quality judgments for 122,331 articles from the 2014-20 UK national assessment. Spearman correlations between author numbers and quality scores show moderately strong positive associations (0.2-0.4) in the health, life, and physical sciences, but weak or no positive associations in engineering and social sciences, with weak negative/positive or no associations in various arts and humanities, and a possible negative association for decision sciences. This gives the first systematic evidence that greater numbers of authors associate with higher quality journal articles in the majority of academia outside the arts and humanities, at least for the UK. Positive associations between team size and citation counts in areas with little association between team size and quality also show that audience effects or other nonquality factors account for the higher citation rates of coauthored articles in some fields.
    Date
    22. 6.2023 18:11:50
  16. Levitt, J.M.; Thelwall, M.: Citation levels and collaboration within library and information science (2009) 0.02
    0.015802983 = product of:
      0.047408946 = sum of:
        0.026132854 = weight(_text_:web in 2734) [ClassicSimilarity], result of:
          0.026132854 = score(doc=2734,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 2734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2734)
        0.02127609 = product of:
          0.04255218 = sum of:
            0.04255218 = weight(_text_:22 in 2734) [ClassicSimilarity], result of:
              0.04255218 = score(doc=2734,freq=4.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.27358043 = fieldWeight in 2734, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2734)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Collaboration is a major research policy objective, but does it deliver higher quality research? This study uses citation analysis to examine the Web of Science (WoS) Information Science & Library Science subject category (IS&LS) to ascertain whether, in general, more highly cited articles are more highly collaborative than other articles. It consists of two investigations. The first investigation is a longitudinal comparison of the degree and proportion of collaboration in five strata of citation; it found that collaboration in the highest four citation strata (all in the most highly cited 22%) increased in unison over time, whereas collaboration in the lowest citation strata (un-cited articles) remained low and stable. Given that over 40% of the articles were un-cited, it seems important to take into account the differences found between un-cited articles and relatively highly cited articles when investigating collaboration in IS&LS. The second investigation compares collaboration for 35 influential information scientists; it found that their more highly cited articles on average were not more highly collaborative than their less highly cited articles. In summary, although collaborative research is conducive to high citation in general, collaboration has apparently not tended to be essential to the success of current and former elite information scientists.
    Date
    22. 3.2009 12:43:51
  17. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.01
    0.0147829745 = product of:
      0.08869784 = sum of:
        0.08869784 = weight(_text_:web in 674) [ClassicSimilarity], result of:
          0.08869784 = score(doc=674,freq=16.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.6119082 = fieldWeight in 674, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
      0.16666667 = coord(1/6)
    
    Abstract
    Introduces several new versions of PageRank (the link-based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that includes pages from different Web sites; however, it did not work well in ranking pages that are from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
  18. Thelwall, M.: Conceptualizing documentation on the Web : an evaluation of different heuristic-based models for counting links between university Web sites (2002) 0.01
    0.01444548 = product of:
      0.08667288 = sum of:
        0.08667288 = weight(_text_:web in 978) [ClassicSimilarity], result of:
          0.08667288 = score(doc=978,freq=22.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.59793836 = fieldWeight in 978, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=978)
      0.16666667 = coord(1/6)
    
    Abstract
    All known previous Web link studies have used the Web page as the primary indivisible source document for counting purposes. Arguments are presented to explain why this is not necessarily optimal and why other alternatives have the potential to produce better results. This is despite the fact that individual Web files are often the only choice if search engines are used for raw data and are the easiest basic Web unit to identify. The central issue is of defining the Web "document": that which should comprise the single indissoluble unit of coherent material. Three alternative heuristics are defined for the educational arena based upon the directory, the domain and the whole university site. These are then compared by implementing them on a set of 108 UK university institutional Web sites under the assumption that a more effective heuristic will tend to produce results that correlate more highly with institutional research productivity. It was discovered that the domain and directory models were able to successfully reduce the impact of anomalous linking behavior between pairs of Web sites, with the latter being the method of choice. Reasons are then given as to why a document model on its own cannot eliminate all anomalies in Web linking behavior. Finally, the results from all models give a clear confirmation of the very strong association between the research productivity of a UK university and the number of incoming links from its peers' Web sites.
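
Entries 17 and 18 match only the term "web", yet the document with fewer occurrences (freq=16) ranks above the one with more (freq=22). A brief sketch of why, assuming the ClassicSimilarity field weight tf * idf * fieldNorm with tf = sqrt(freq); the idf and fieldNorm values are copied from the explain output, and the function name is illustrative:

```python
import math

IDF_WEB = 3.2635105  # idf(docFreq=4597, maxDocs=44218), from the explain output


def field_weight(freq: float, term_idf: float, field_norm: float) -> float:
    """fieldWeight = sqrt(freq) * idf * fieldNorm."""
    return math.sqrt(freq) * term_idf * field_norm


# Entry 17 (doc=674): fewer hits, but a larger fieldNorm (shorter field).
fw_674 = field_weight(16.0, IDF_WEB, 0.046875)

# Entry 18 (doc=978): more hits, but a smaller fieldNorm (longer field).
fw_978 = field_weight(22.0, IDF_WEB, 0.0390625)

print(f"doc 674: {fw_674:.6f}  doc 978: {fw_978:.6f}")
```

Because tf grows only with the square root of the frequency, the six extra occurrences in doc 978 do not offset its lower length normalization.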
  19. Thelwall, M.; Thelwall, S.: A thematic analysis of highly retweeted early COVID-19 tweets : consensus, information, dissent and lockdown life (2020) 0.01
    0.013725774 = product of:
      0.04117732 = sum of:
        0.026132854 = weight(_text_:web in 178) [ClassicSimilarity], result of:
          0.026132854 = score(doc=178,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 178, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=178)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 178) [ClassicSimilarity], result of:
              0.030088935 = score(doc=178,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=178)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose: Public attitudes towards COVID-19 and social distancing are critical in reducing its spread. It is therefore important to understand public reactions and information dissemination in all major forms, including on social media. This article investigates important issues reflected on Twitter in the early stages of the public reaction to COVID-19. Design/methodology/approach: A thematic analysis of the most retweeted English-language tweets mentioning COVID-19 during March 10-29, 2020. Findings: The main themes identified for the 87 qualifying tweets accounting for 14 million retweets were: lockdown life; attitude towards social restrictions; politics; safety messages; people with COVID-19; support for key workers; work; and COVID-19 facts/news. Research limitations/implications: Twitter played many positive roles, mainly through unofficial tweets. Users shared social distancing information, helped build support for social distancing, criticised government responses, expressed support for key workers and helped each other cope with social isolation. A few popular tweets not supporting social distancing show that government messages sometimes failed. Practical implications: Public health campaigns in future may consider encouraging grass roots social web activity to support campaign goals. At a methodological level, analysing retweet counts emphasised politics and ignored practical implementation issues. Originality/value: This is the first qualitative analysis of general COVID-19-related retweeting.
    Date
    20. 1.2015 18:30:22
  20. Vaughan, L.; Thelwall, M.: Scholarly use of the Web : what are the key inducers of links to journal Web sites? (2003) 0.01
    0.013066426 = product of:
      0.078398556 = sum of:
        0.078398556 = weight(_text_:web in 1236) [ClassicSimilarity], result of:
          0.078398556 = score(doc=1236,freq=18.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5408555 = fieldWeight in 1236, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1236)
      0.16666667 = coord(1/6)
    
    Abstract
    Web links have been studied by information scientists for at least six years but it is only in the past two that clear evidence has emerged to show that counts of links to scholarly Web spaces (universities and departments) can correlate significantly with research measures, giving some credence to their use for the investigation of scholarly communication. This paper reports on a study to investigate the factors that influence the creation of links to journal Web sites. An empirical approach is used: collecting data and testing for significant patterns. The specific questions addressed are whether site age and site content are inducers of links to a journal's Web site as measured by the ratio of link counts to Journal Impact Factors, two variables previously discovered to be related. A new methodology for data collection is also introduced that uses the Internet Archive to obtain an earliest known creation date for Web sites. The results show that both site age and site content are significant factors for the disciplines studied: library and information science, and law. Comparisons between the two fields also show disciplinary differences in Web site characteristics. Scholars and publishers should be particularly aware that richer content on a journal's Web site tends to generate links and thus the traffic to the site.