Search (59 results, page 1 of 3)

  • author_ss:"Thelwall, M."
  1. Price, L.; Thelwall, M.: The clustering power of low frequency words in academic webs (2005) 0.14
    0.14488001 = product of:
      0.28976002 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 3561) [ClassicSimilarity], result of:
              0.03295912 = score(doc=3561,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 3561, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3561)
          0.25 = coord(1/4)
        0.28152025 = weight(_text_:frequency in 3561) [ClassicSimilarity], result of:
          0.28152025 = score(doc=3561,freq=10.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            1.0183823 = fieldWeight in 3561, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3561)
      0.5 = coord(2/4)
    
    Abstract
    The value of low frequency words for subject-based academic Web site clustering is assessed. A new technique is introduced to compare the relative clustering power of different vocabularies. The technique is designed for word frequency tests in large document clustering exercises. Results for the Australian and New Zealand academic Web spaces indicate that low frequency words are useful for clustering academic Web sites along subject lines; removing low frequency words results in sites becoming, on average, less dissimilar to sites from other subjects.
  2. Thelwall, M.: Text characteristics of English language university Web sites (2005) 0.10
    0.096987605 = product of:
      0.19397521 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 3463) [ClassicSimilarity], result of:
              0.028250674 = score(doc=3463,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 3463, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3463)
          0.25 = coord(1/4)
        0.18691254 = weight(_text_:frequency in 3463) [ClassicSimilarity], result of:
          0.18691254 = score(doc=3463,freq=6.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.6761447 = fieldWeight in 3463, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.046875 = fieldNorm(doc=3463)
      0.5 = coord(2/4)
    
    Abstract
    The nature of the contents of academic Web sites is of direct relevance to the new field of scientific Web intelligence, and for search engine and topic-specific crawler designers. We analyze word frequencies in national academic Webs using the Web sites of three English-speaking nations: Australia, New Zealand, and the United Kingdom. Strong regularities were found in page size and word frequency distributions, but with significant anomalies. At least 26% of pages contain no words. High frequency words include university names and acronyms, Internet terminology, and computing product names: not always words in common usage away from the Web. A minority of low frequency words are spelling mistakes, with other common types including nonwords, proper names, foreign language terms or computer science variable names. Based upon these findings, recommendations for data cleansing and filtering are made, particularly for clustering applications.
  3. Thelwall, M.; Buckley, K.; Paltoglou, G.; Cai, D.; Kappas, A.: Sentiment strength detection in short informal text (2010) 0.06
    0.058685057 = product of:
      0.07824674 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 4200) [ClassicSimilarity], result of:
              0.023542227 = score(doc=4200,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 4200, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4200)
          0.25 = coord(1/4)
        0.056460675 = weight(_text_:term in 4200) [ClassicSimilarity], result of:
          0.056460675 = score(doc=4200,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 4200, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4200)
        0.015900511 = product of:
          0.031801023 = sum of:
            0.031801023 = weight(_text_:22 in 4200) [ClassicSimilarity], result of:
              0.031801023 = score(doc=4200,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19345059 = fieldWeight in 4200, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4200)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.
    Date
    22. 1.2011 14:29:23
  4. Thelwall, M.; Prabowo, R.; Fairclough, R.: Are raw RSS feeds suitable for broad issue scanning? : a science concern case study (2006) 0.05
    0.047906943 = product of:
      0.095813885 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 6116) [ClassicSimilarity], result of:
              0.023542227 = score(doc=6116,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 6116, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6116)
          0.25 = coord(1/4)
        0.08992833 = weight(_text_:frequency in 6116) [ClassicSimilarity], result of:
          0.08992833 = score(doc=6116,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.32531026 = fieldWeight in 6116, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6116)
      0.5 = coord(2/4)
    
    Abstract
    Broad issue scanning is the task of identifying important public debates arising in a given broad issue; really simple syndication (RSS) feeds are a natural information source for investigating broad issues. RSS, as originally conceived, is a method for publishing timely and concise information on the Internet, for example, about the main stories in a news site or the latest postings in a blog. RSS feeds are potentially a nonintrusive source of high-quality data about public opinion: Monitoring a large number may allow quantitative methods to extract information relevant to a given need. In this article we describe an RSS feed-based coword frequency method to identify bursts of discussion relevant to a given broad issue. A case study of public science concerns is used to demonstrate the method and assess the suitability of raw RSS feeds for broad issue scanning (i.e., without data cleansing). An attempt to identify genuine science concern debates from the corpus through investigating the top 1,000 "burst" words found only two genuine debates, however. The low success rate was mainly caused by a few pathological feeds that dominated the results and obscured any significant debates. The results point to the need to develop effective data cleansing procedures for RSS feeds, particularly if there is not a large quantity of discussion about the broad issue, and a range of potential techniques is suggested. Finally, the analysis confirmed that the time series information generated by real-time monitoring of RSS feeds could usefully illustrate the evolution of new debates relevant to a broad issue.
  5. Thelwall, M.; Buckley, K.; Paltoglou, G.: Sentiment in Twitter events (2011) 0.04
    0.04341671 = product of:
      0.08683342 = sum of:
        0.06775281 = weight(_text_:term in 4345) [ClassicSimilarity], result of:
          0.06775281 = score(doc=4345,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 4345, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=4345)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 4345) [ClassicSimilarity], result of:
              0.038161222 = score(doc=4345,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 4345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4345)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The microblogging site Twitter generates a constant stream of communication, some of which concerns events of general interest. An analysis of Twitter may, therefore, give insights into why particular events resonate with the population. This article reports a study of a month of English Twitter posts, assessing whether popular events are typically associated with increases in sentiment strength, as seems intuitively likely. Using the top 30 events, determined by a measure of relative increase in (general) term usage, the results give strong evidence that popular events are normally associated with increases in negative sentiment strength and some evidence that peaks of interest in events have stronger positive sentiment than the time before the peak. It seems that many positive events, such as the Oscars, are capable of generating increased negative sentiment in reaction to them. Nevertheless, the surprisingly small average change in sentiment associated with popular events (typically 1% and only 6% for Tiger Woods' confessions) is consistent with events affording posters opportunities to satisfy pre-existing personal goals more often than eliciting instinctive reactions.
    Date
    22. 1.2011 14:27:06
  6. Didegah, F.; Thelwall, M.: Co-saved, co-tweeted, and co-cited networks (2018) 0.03
    0.03325464 = product of:
      0.13301855 = sum of:
        0.13301855 = sum of:
          0.094857335 = weight(_text_:assessment in 4291) [ClassicSimilarity], result of:
            0.094857335 = score(doc=4291,freq=2.0), product of:
              0.25917634 = queryWeight, product of:
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.04694356 = queryNorm
              0.36599535 = fieldWeight in 4291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.046875 = fieldNorm(doc=4291)
          0.038161222 = weight(_text_:22 in 4291) [ClassicSimilarity], result of:
            0.038161222 = score(doc=4291,freq=2.0), product of:
              0.16438834 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04694356 = queryNorm
              0.23214069 = fieldWeight in 4291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4291)
      0.25 = coord(1/4)
    
    Abstract
    Counts of tweets and Mendeley user libraries have been proposed as altmetric alternatives to citation counts for the impact assessment of articles. Although both have been investigated to discover whether they correlate with article citations, it is not known whether users tend to tweet or save (in Mendeley) the same kinds of articles that they cite. In response, this article compares pairs of articles that are tweeted, saved to a Mendeley library, or cited by the same user, but possibly a different user for each source. The study analyzes 1,131,318 articles published in 2012, with minimum tweeted (10), saved to Mendeley (100), and cited (10) thresholds. The results show surprisingly minor overall overlaps between the three phenomena. The importance of journals for Twitter and the presence of many bots at different levels of activity suggest that this site has little value for impact altmetrics. The moderate differences between patterns of saving and citation suggest that Mendeley can be used for some types of impact assessments, but sensitivity is needed for underlying differences.
    Date
    28. 7.2018 10:00:22
  7. Kousha, K.; Thelwall, M.; Rezaie, S.: Assessing the citation impact of books : the role of Google Books, Google Scholar, and Scopus (2011) 0.03
    0.033044655 = product of:
      0.06608931 = sum of:
        0.010194084 = product of:
          0.040776335 = sum of:
            0.040776335 = weight(_text_:based in 4920) [ClassicSimilarity], result of:
              0.040776335 = score(doc=4920,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28829288 = fieldWeight in 4920, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4920)
          0.25 = coord(1/4)
        0.055895224 = product of:
          0.11179045 = sum of:
            0.11179045 = weight(_text_:assessment in 4920) [ClassicSimilarity], result of:
              0.11179045 = score(doc=4920,freq=4.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.43132967 = fieldWeight in 4920, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4920)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Citation indicators are increasingly used in some subject areas to support peer review in the evaluation of researchers and departments. Nevertheless, traditional journal-based citation indexes may be inadequate for the citation impact assessment of book-based disciplines. This article examines whether online citations from Google Books and Google Scholar can provide alternative sources of citation evidence. To investigate this, we compared the citation counts to 1,000 books submitted to the 2008 U.K. Research Assessment Exercise (RAE) from Google Books and Google Scholar with Scopus citations across seven book-based disciplines (archaeology; law; politics and international studies; philosophy; sociology; history; and communication, cultural, and media studies). Google Books and Google Scholar citations to books were 1.4 and 3.2 times more common than were Scopus citations, and their medians were more than twice and three times as high as were Scopus median citations, respectively. This large number of citations is evidence that in book-oriented disciplines in the social sciences, arts, and humanities, online book citations may be sufficiently numerous to support peer review for research evaluation, at least in the United Kingdom.
  8. Thelwall, M.; Kousha, K.; Abdoli, M.; Stuart, E.; Makita, M.; Wilson, P.; Levitt, J.: Do altmetric scores reflect article quality? : evidence from the UK Research Excellence Framework 2021 (2023) 0.03
    0.032109328 = product of:
      0.064218655 = sum of:
        0.008323434 = product of:
          0.033293735 = sum of:
            0.033293735 = weight(_text_:based in 947) [ClassicSimilarity], result of:
              0.033293735 = score(doc=947,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23539014 = fieldWeight in 947, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=947)
          0.25 = coord(1/4)
        0.055895224 = product of:
          0.11179045 = sum of:
            0.11179045 = weight(_text_:assessment in 947) [ClassicSimilarity], result of:
              0.11179045 = score(doc=947,freq=4.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.43132967 = fieldWeight in 947, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=947)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Altmetrics are web-based quantitative impact or attention indicators for academic articles that have been proposed to supplement citation counts. This article reports the first assessment of the extent to which mature altmetrics from Altmetric.com and Mendeley associate with individual article quality scores. It exploits expert norm-referenced peer review scores from the UK Research Excellence Framework 2021 for 67,030+ journal articles in all fields 2014-2017/2018, split into 34 broadly field-based Units of Assessment (UoAs). Altmetrics correlated more strongly with research quality than previously found, although less strongly than raw and field normalized Scopus citation counts. Surprisingly, field normalizing citation counts can reduce their strength as a quality indicator for articles in a single field. For most UoAs, Mendeley reader counts are the best altmetric (e.g., three Spearman correlations with quality scores above 0.5), tweet counts are also a moderate strength indicator in eight UoAs (Spearman correlations with quality scores above 0.3), ahead of news (eight correlations above 0.3, but generally weaker), blogs (five correlations above 0.3), and Facebook (three correlations above 0.3) citations, at least in the United Kingdom. In general, altmetrics are the strongest indicators of research quality in the health and physical sciences and weakest in the arts and humanities.
  9. Thelwall, M.; Stuart, D.: Web crawling ethics revisited : cost, privacy, and denial of service (2006) 0.03
    0.031786613 = product of:
      0.06357323 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 6098) [ClassicSimilarity], result of:
              0.03295912 = score(doc=6098,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 6098, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6098)
          0.25 = coord(1/4)
        0.055333447 = product of:
          0.11066689 = sum of:
            0.11066689 = weight(_text_:assessment in 6098) [ClassicSimilarity], result of:
              0.11066689 = score(doc=6098,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.4269946 = fieldWeight in 6098, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6098)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Ethical aspects of the employment of Web crawlers for information science research and other contexts are reviewed. The difference between legal and ethical uses of communications technologies is emphasized as well as the changing boundary between ethical and unethical conduct. A review of the potential impacts on Web site owners is used to underpin a new framework for ethical crawling, and it is argued that delicate human judgment is required for each individual case, with verdicts likely to change over time. Decisions can be based upon an approximate cost-benefit analysis, but it is crucial that crawler owners find out about the technological issues affecting the owners of the sites being crawled in order to produce an informed assessment.
  10. Thelwall, M.; Kousha, K.; Abdoli, M.; Stuart, E.; Makita, M.; Wilson, P.; Levitt, J.: Why are coauthored academic articles more cited : higher quality or larger audience? (2023) 0.03
    0.027712202 = product of:
      0.11084881 = sum of:
        0.11084881 = sum of:
          0.079047784 = weight(_text_:assessment in 995) [ClassicSimilarity], result of:
            0.079047784 = score(doc=995,freq=2.0), product of:
              0.25917634 = queryWeight, product of:
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.04694356 = queryNorm
              0.30499613 = fieldWeight in 995, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0390625 = fieldNorm(doc=995)
          0.031801023 = weight(_text_:22 in 995) [ClassicSimilarity], result of:
            0.031801023 = score(doc=995,freq=2.0), product of:
              0.16438834 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04694356 = queryNorm
              0.19345059 = fieldWeight in 995, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=995)
      0.25 = coord(1/4)
    
    Abstract
    Collaboration is encouraged because it is believed to improve academic research, supported by indirect evidence in the form of more coauthored articles being more cited. Nevertheless, this might not reflect quality but increased self-citations or the "audience effect": citations from increased awareness through multiple author networks. We address this with the first science wide investigation into whether author numbers associate with journal article quality, using expert peer quality judgments for 122,331 articles from the 2014-20 UK national assessment. Spearman correlations between author numbers and quality scores show moderately strong positive associations (0.2-0.4) in the health, life, and physical sciences, but weak or no positive associations in engineering and social sciences, with weak negative/positive or no associations in various arts and humanities, and a possible negative association for decision sciences. This gives the first systematic evidence that greater numbers of authors associates with higher quality journal articles in the majority of academia outside the arts and humanities, at least for the UK. Positive associations between team size and citation counts in areas with little association between team size and quality also show that audience effects or other nonquality factors account for the higher citation rates of coauthored articles in some fields.
    Date
    22. 6.2023 18:11:50
  11. Thelwall, M.: Web indicators for research evaluation : a practical guide (2016) 0.02
    0.023923663 = product of:
      0.047847327 = sum of:
        0.008323434 = product of:
          0.033293735 = sum of:
            0.033293735 = weight(_text_:based in 3384) [ClassicSimilarity], result of:
              0.033293735 = score(doc=3384,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23539014 = fieldWeight in 3384, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3384)
          0.25 = coord(1/4)
        0.039523892 = product of:
          0.079047784 = sum of:
            0.079047784 = weight(_text_:assessment in 3384) [ClassicSimilarity], result of:
              0.079047784 = score(doc=3384,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.30499613 = fieldWeight in 3384, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3384)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In recent years there has been an increasing demand for research evaluation within universities and other research-based organisations. In parallel, there has been an increasing recognition that traditional citation-based indicators are not able to reflect the societal impacts of research and are slow to appear. This has led to the creation of new indicators for different types of research impact as well as timelier indicators, mainly derived from the Web. These indicators have been called altmetrics, webometrics or just web metrics. This book describes and evaluates a range of web indicators for aspects of societal or scholarly impact, discusses the theory and practice of using and evaluating web indicators for research assessment and outlines practical strategies for obtaining many web indicators. In addition to describing impact indicators for traditional scholarly outputs, such as journal articles and monographs, it also covers indicators for videos, datasets, software and other non-standard scholarly outputs. The book describes strategies to analyse web indicators for individual publications as well as to compare the impacts of groups of publications. The practical part of the book includes descriptions of how to use the free software Webometric Analyst to gather and analyse web data. This book is written for information science undergraduate and Master's students that are learning about alternative indicators or scientometrics as well as Ph.D. students and other researchers and practitioners using indicators to help assess research impact or to study scholarly communication.
  12. Thelwall, M.; Kousha, K.; Stuart, E.; Makita, M.; Abdoli, M.; Wilson, P.; Levitt, J.: In which fields are citations indicators of research quality? (2023) 0.02
    0.022704724 = product of:
      0.04540945 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 1033) [ClassicSimilarity], result of:
              0.023542227 = score(doc=1033,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 1033, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1033)
          0.25 = coord(1/4)
        0.039523892 = product of:
          0.079047784 = sum of:
            0.079047784 = weight(_text_:assessment in 1033) [ClassicSimilarity], result of:
              0.079047784 = score(doc=1033,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.30499613 = fieldWeight in 1033, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1033)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Citation counts are widely used as indicators of research quality to support or replace human peer review and for lists of top cited papers, researchers, and institutions. Nevertheless, the relationship between citations and research quality is poorly evidenced. We report the first large-scale science-wide academic evaluation of the relationship between research quality and citations (field normalized citation counts), correlating them for 87,739 journal articles in 34 field-based UK Units of Assessment (UoA). The two correlate positively in all academic fields, from very weak (0.1) to strong (0.5), reflecting broadly linear relationships in all fields. We give the first evidence that the correlations are positive even across the arts and humanities. The patterns are similar for the field classification schemes of Scopus and Dimensions.ai, although varying for some individual subjects and therefore more uncertain for these. We also show for the first time that no field has a citation threshold beyond which all articles are excellent quality, so lists of top cited articles are not pure collections of excellence, and neither is any top citation percentile indicator. Thus, while appropriately field normalized citations associate positively with research quality in all fields, they never perfectly reflect it, even at high values.
  13. Thelwall, M.; Sud, P.: Do new research issues attract more citations? : a comparison between 25 Scopus subject categories (2021) 0.02
    0.019961866 = product of:
      0.07984746 = sum of:
        0.07984746 = weight(_text_:term in 157) [ClassicSimilarity], result of:
          0.07984746 = score(doc=157,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.3645336 = fieldWeight in 157, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=157)
      0.25 = coord(1/4)
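
The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions that match the printed values (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library; queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed here: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                  # tf(freq=4.0) = 2.0
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm       # queryWeight line in the tree
    field_weight = tf * idf * field_norm  # fieldWeight line in the tree
    return query_weight * field_weight


# Values copied from the tree for doc=157, term "term".
term_score = classic_term_score(
    freq=4.0, doc_freq=1130, max_docs=44218,
    query_norm=0.04694356, field_norm=0.0390625,
)

# Only 1 of the 4 query clauses matched, so coord(1/4) scales the sum.
doc_score = term_score * (1 / 4)
print(f"term score = {term_score:.8f}, document score = {doc_score:.9f}")
```

The result should agree with the listed 0.019961866 up to the rounding of the single-precision values shown in the tree.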
    
    Abstract
    Finding new ways to help researchers and administrators understand academic fields is an important task for information scientists. Given the importance of interdisciplinary research, it is essential to be aware of disciplinary differences in aspects of scholarship, such as the significance of recent changes in a field. This paper identifies potential changes in 25 subject categories through a term comparison of words in article titles, keywords and abstracts in 1 year compared to the previous 4 years. The scholarly influence of new research issues is indirectly assessed with a citation analysis of articles matching each trending term. While topic-related words dominate the top terms, style, national focus, and language changes are also evident. Thus, as reflected in Scopus, fields evolve along multiple dimensions. Moreover, while articles exploiting new issues are usually more cited in some fields, such as Organic Chemistry, they are usually less cited in others, including History. The possible causes of new issues being less cited include externally driven temporary factors, such as disease outbreaks, and internally driven temporary decisions, such as a deliberate emphasis on a single topic (e.g., through a journal special issue).
  14. Thelwall, M.; Kousha, K.: SlideShare presentations, citations, users, and trends : a professional site with academic and educational uses (2017) 0.01
    0.014115169 = product of:
      0.056460675 = sum of:
        0.056460675 = weight(_text_:term in 3766) [ClassicSimilarity], result of:
          0.056460675 = score(doc=3766,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 3766, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3766)
      0.25 = coord(1/4)
    
    Abstract
    SlideShare is a free social website that aims to help users distribute and find presentations. Owned by LinkedIn since 2012, it targets a professional audience but may give value to scholarship through creating a long-term record of the content of talks. This article tests this hypothesis by analyzing sets of general and scholarly related SlideShare documents using content and citation analysis and popularity statistics reported on the site. The results suggest that academics, students, and teachers are a minority of SlideShare uploaders, especially since 2010, with most documents not being directly related to scholarship or teaching. About two thirds of uploaded SlideShare documents are presentation slides, with the remainder often being files associated with presentations or video recordings of talks. SlideShare is therefore a presentation-centered site with a predominantly professional user base. Although a minority of the uploaded SlideShare documents are cited by, or cite, academic publications, probably too few articles are cited by SlideShare to consider extracting SlideShare citations for research evaluation. Nevertheless, scholars should consider SlideShare to be a potential source of academic and nonacademic information, particularly in library and information science, education, and business.
  15. Thelwall, M.: Female citation impact superiority 1996-2018 in six out of seven English-speaking nations (2020) 0.01
    0.014115169 = product of:
      0.056460675 = sum of:
        0.056460675 = weight(_text_:term in 5948) [ClassicSimilarity], result of:
          0.056460675 = score(doc=5948,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 5948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5948)
      0.25 = coord(1/4)
    
    Abstract
    Efforts to combat continuing gender inequalities in academia need to be informed by evidence about where differences occur. Citations are relevant as potential evidence in appointment and promotion decisions, but it is unclear whether there have been historical gender differences in average citation impact that might explain the current shortfall of senior female academics. This study investigates the evolution of gender differences in citation impact 1996-2018 for six million articles from seven large English-speaking nations: Australia, Canada, Ireland, Jamaica, New Zealand, UK, and the USA. The results show that a small female citation advantage has been the norm over time for all these countries except the USA, where there has been no practical difference. The female citation advantage is largest, and statistically significant in most years, for Australia and the UK. This suggests that any academic bias against citing female-authored research cannot explain current employment inequalities. Nevertheless, comparisons using recent citation data, or avoiding it altogether, during appointments or promotion may disadvantage females in some countries by underestimating the likely greater impact of their work, especially in the long term.
  16. Thelwall, M.; Sud, P.: Mendeley readership counts : an investigation of temporal and disciplinary differences (2016) 0.01
    0.013071639 = product of:
      0.026143279 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 3211) [ClassicSimilarity], result of:
              0.028250674 = score(doc=3211,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 3211, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3211)
          0.25 = coord(1/4)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 3211) [ClassicSimilarity], result of:
              0.038161222 = score(doc=3211,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 3211, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3211)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
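
When more than one clause matches, the tree nests: each clause is scaled by its own coord factor, the clause scores are summed, and the sum is scaled by the outer coord. A minimal sketch of that arithmetic for doc=3211, taking the two leaf scores from the listing as given (the helper name `scaled` is hypothetical):

```python
def scaled(score: float, matched: int, total: int) -> float:
    """Apply a coord(matched/total) factor to a score."""
    return score * matched / total


# Leaf scores copied from the tree for doc=3211.
based_clause = scaled(0.028250674, 1, 4)   # weight(_text_:based), coord(1/4)
clause_22 = scaled(0.038161222, 1, 2)      # weight(_text_:22), coord(1/2)

# Two of the four top-level clauses matched, hence the outer coord(2/4).
doc_score = scaled(based_clause + clause_22, 2, 4)
print(f"{based_clause:.10f} + {clause_22:.9f} -> {doc_score:.9f}")
```

The intermediate values correspond to the 0.0070626684 and 0.019080611 lines in the tree, and the final value to the listed 0.013071639.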
    
    Abstract
    Scientists and managers using citation-based indicators to help evaluate research cannot evaluate recent articles because of the time needed for citations to accrue. Reading occurs before citing, however, and so it makes sense to count readers rather than citations for recent publications. To assess this, Mendeley readers and citations were obtained for articles from 2004 to late 2014 in five broad categories (agriculture, business, decision science, pharmacy, and the social sciences) and 50 subcategories. In these areas, citation counts tended to increase with every extra year since publication, and readership counts tended to increase faster initially but then stabilize after about 5 years. The correlation between citations and readers was also higher for longer time periods, stabilizing after about 5 years. Although there were substantial differences between broad fields and smaller differences between subfields, the results confirm the value of Mendeley reader counts as early scientific impact indicators.
    Date
    16.11.2016 11:07:22
  17. Thelwall, M.; Sud, P.; Wilkinson, D.: Link and co-inlink network diagrams with URL citations or title mentions (2012) 0.01
    0.0130472975 = product of:
      0.026094595 = sum of:
        0.010194084 = product of:
          0.040776335 = sum of:
            0.040776335 = weight(_text_:based in 57) [ClassicSimilarity], result of:
              0.040776335 = score(doc=57,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28829288 = fieldWeight in 57, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=57)
          0.25 = coord(1/4)
        0.015900511 = product of:
          0.031801023 = sum of:
            0.031801023 = weight(_text_:22 in 57) [ClassicSimilarity], result of:
              0.031801023 = score(doc=57,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19345059 = fieldWeight in 57, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=57)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Webometric network analyses have been used to map the connectivity of groups of websites to identify clusters, important sites or overall structure. Such analyses have mainly been based upon hyperlink counts, the number of hyperlinks between a pair of websites, although some have used title mentions or URL citations instead. The ability to automatically gather hyperlink counts from Yahoo! ceased in April 2011 and the ability to manually gather such counts was due to cease by early 2012, creating a need for alternatives. This article assesses URL citations and title mentions as possible replacements for hyperlinks in both binary and weighted direct link and co-inlink network diagrams. It also assesses three different types of data for the network connections: hit count estimates, counts of matching URLs, and filtered counts of matching URLs. Results from analyses of U.S. library and information science departments and U.K. universities give evidence that metrics based upon URLs or titles can be appropriate replacements for metrics based upon hyperlinks for both binary and weighted networks, although filtered counts of matching URLs are necessary to give the best results for co-title mention and co-URL citation network diagrams.
    Date
    6. 4.2012 18:16:22
  18. Kousha, K.; Thelwall, M.: How is science cited on the Web? : a classification of google unique Web citations (2007) 0.01
    0.010893034 = product of:
      0.021786068 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 586) [ClassicSimilarity], result of:
              0.023542227 = score(doc=586,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 586, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=586)
          0.25 = coord(1/4)
        0.015900511 = product of:
          0.031801023 = sum of:
            0.031801023 = weight(_text_:22 in 586) [ClassicSimilarity], result of:
              0.031801023 = score(doc=586,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19345059 = fieldWeight in 586, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=586)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Although the analysis of citations in the scholarly literature is now an established and relatively well understood part of information science, not enough is known about citations that can be found on the Web. In particular, are there new Web types, and if so, are these trivial or potentially useful for studying or evaluating research communication? We sought evidence based upon a sample of 1,577 Web citations of the URLs or titles of research articles in 64 open-access journals from biology, physics, chemistry, and computing. Only 25% represented intellectual impact, from references of Web documents (23%) and other informal scholarly sources (2%). Many of the Web/URL citations were created for general or subject-specific navigation (45%) or for self-publicity (22%). Additional analyses revealed significant disciplinary differences in the types of Google unique Web/URL citations as well as some characteristics of scientific open-access publishing on the Web. We conclude that the Web provides access to a new and different type of citation information, one that may therefore enable us to measure different aspects of research, and the research process in particular; but to obtain good information, the different types should be separated.
  19. Thelwall, M.; Kousha, K.: Online presentations as a source of scientific impact? : an analysis of PowerPoint files citing academic journals (2008) 0.01
    0.009880973 = product of:
      0.039523892 = sum of:
        0.039523892 = product of:
          0.079047784 = sum of:
            0.079047784 = weight(_text_:assessment in 1614) [ClassicSimilarity], result of:
              0.079047784 = score(doc=1614,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.30499613 = fieldWeight in 1614, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1614)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Open-access online publication has made available an increasingly wide range of document types for scientometric analysis. In this article, we focus on citations in online presentations, seeking evidence of their value as nontraditional indicators of research impact. For this purpose, we searched for online PowerPoint files mentioning any one of 1,807 ISI-indexed journals in ten science and ten social science disciplines. We also manually classified 1,378 online PowerPoint citations to journals in eight additional science and social science disciplines. The results showed that very few journals were cited frequently enough in online PowerPoint files to make impact assessment worthwhile, with the main exceptions being popular magazines like Scientific American and Harvard Business Review. Surprisingly, however, there was little difference overall in the number of PowerPoint citations to science and to the social sciences, and also in the proportion representing traditional impact (about 60%) and wider impact (about 15%). It seems that the main scientometric value for online presentations may be in tracking the popularization of research, or for comparing the impact of whole journals rather than individual articles.
  20. Kousha, K.; Thelwall, M.: ¬An automatic method for extracting citations from Google Books (2015) 0.01
    0.009880973 = product of:
      0.039523892 = sum of:
        0.039523892 = product of:
          0.079047784 = sum of:
            0.079047784 = weight(_text_:assessment in 1658) [ClassicSimilarity], result of:
              0.079047784 = score(doc=1658,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.30499613 = fieldWeight in 1658, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1658)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Recent studies have shown that counting citations from books can help scholarly impact assessment and that Google Books (GB) is a useful source of such citation counts, despite its lack of a public citation index. Searching GB for citations produces approximate matches, however, and so its raw results need time-consuming human filtering. In response, this article introduces a method to automatically remove false and irrelevant matches from GB citation searches in addition to introducing refinements to a previous GB manual citation extraction method. The method was evaluated by manual checking of sampled GB results and comparing citations to about 14,500 monographs in the Thomson Reuters Book Citation Index (BKCI) against automatically extracted citations from GB across 24 subject areas. GB citations were 103% to 137% as numerous as BKCI citations in the humanities, except for tourism (72%) and linguistics (91%), 46% to 85% in social sciences, but only 8% to 53% in the sciences. In all cases, however, GB had substantially more citing books than did BKCI, with BKCI's results coming predominantly from journal articles. However, moderate correlations between the GB and BKCI citation counts in social sciences and humanities, with most BKCI results coming from journal articles rather than books, suggest that they could measure different aspects of impact.