Search (29 results, page 1 of 2)

  • × author_ss:"Thelwall, M."
  • × year_i:[2010 TO 2020}
  1. Thelwall, M.; Buckley, K.; Paltoglou, G.: Sentiment strength detection for the social web (2012) 0.06
    0.056814235 = product of:
      0.1704427 = sum of:
        0.051356614 = weight(_text_:web in 4972) [ClassicSimilarity], result of:
          0.051356614 = score(doc=4972,freq=12.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.4416067 = fieldWeight in 4972, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
        0.029083263 = weight(_text_:world in 4972) [ClassicSimilarity], result of:
          0.029083263 = score(doc=4972,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.21233483 = fieldWeight in 4972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
        0.038646206 = weight(_text_:wide in 4972) [ClassicSimilarity], result of:
          0.038646206 = score(doc=4972,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.24476713 = fieldWeight in 4972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
        0.051356614 = weight(_text_:web in 4972) [ClassicSimilarity], result of:
          0.051356614 = score(doc=4972,freq=12.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.4416067 = fieldWeight in 4972, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4972)
      0.33333334 = coord(4/12)
    
    Abstract
    Sentiment analysis is concerned with the automatic extraction of sentiment-related information from text. Although most sentiment analysis addresses commercial tasks, such as extracting opinions from product reviews, there is increasing interest in the affective dimension of the social web, and Twitter in particular. Most sentiment analysis algorithms are not ideally suited to this task because they exploit indirect indicators of sentiment that can reflect genre or topic instead. Hence, such algorithms used to process social web texts can identify spurious sentiment patterns caused by topics rather than affective phenomena. This article assesses an improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment. The results from six diverse social web data sets (MySpace, Twitter, YouTube, Digg, Runners World, BBC Forums) indicate that SentiStrength 2 is successful in the sense of performing better than a baseline approach for all data sets in both supervised and unsupervised cases. SentiStrength is not always better than machine-learning approaches that exploit indirect indicators of sentiment, however, and is particularly weaker for positive sentiment in news-related discussions. Overall, the results suggest that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.
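The score breakdown for result 1 can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the usual ClassicSimilarity relations tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the values displayed above. The queryNorm, fieldNorm and coord values are copied from the listing rather than derived, and the function and constant names are illustrative.

```python
import math

# Values copied from the explain output for doc 4972 (result 1).
MAX_DOCS = 44218
QUERY_NORM = 0.035634913   # depends on the whole query; taken as given
FIELD_NORM = 0.0390625     # fieldNorm(doc=4972)
COORD = 4 / 12             # 4 of 12 query clauses matched


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * FIELD_NORM
    return query_weight * field_weight


# (termFreq, docFreq) for the matched clauses: web, world, wide, web.
matches = [(12.0, 4597), (2.0, 2573), (2.0, 1430), (12.0, 4597)]
score = COORD * sum(term_score(freq, df) for freq, df in matches)
print(round(score, 6))
```

The printed value should agree with the 0.056814235 shown for result 1 up to single-precision rounding. The term "web" is counted twice because it appears as two separate clauses in the explain tree.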
  2. Thelwall, M.; Klitkou, A.; Verbeek, A.; Stuart, D.; Vincent, C.: Policy-relevant Webometrics for individual scientific fields (2010) 0.02
    0.024173612 = product of:
      0.09669445 = sum of:
        0.025159499 = weight(_text_:web in 3574) [ClassicSimilarity], result of:
          0.025159499 = score(doc=3574,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 3574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3574)
        0.046375446 = weight(_text_:wide in 3574) [ClassicSimilarity], result of:
          0.046375446 = score(doc=3574,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.29372054 = fieldWeight in 3574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=3574)
        0.025159499 = weight(_text_:web in 3574) [ClassicSimilarity], result of:
          0.025159499 = score(doc=3574,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 3574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3574)
      0.25 = coord(3/12)
    
    Abstract
    Despite over 10 years of research there is no agreement on the most suitable roles for Webometric indicators in support of research policy and almost no field-based Webometrics. This article partly fills these gaps by analyzing the potential of policy-relevant Webometrics for individual scientific fields with the help of 4 case studies. Although Webometrics cannot provide robust indicators of knowledge flows or research impact, it can provide some evidence of networking and mutual awareness. The scope of Webometrics is also relatively wide, including not only research organizations and firms but also intermediary groups like professional associations, Web portals, and government agencies. Webometrics can, therefore, provide evidence about the research process to complement peer review, bibliometric, and patent indicators: tracking the early, mainly prepublication development of new fields and research funding initiatives, assessing the role and impact of intermediary organizations and the need for new ones, and monitoring the extent of mutual awareness in particular research areas.
  3. Thelwall, M.: Web indicators for research evaluation : a practical guide (2016) 0.02
    0.01976717 = product of:
      0.11860302 = sum of:
        0.05930151 = weight(_text_:web in 3384) [ClassicSimilarity], result of:
          0.05930151 = score(doc=3384,freq=16.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.5099235 = fieldWeight in 3384, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3384)
        0.05930151 = weight(_text_:web in 3384) [ClassicSimilarity], result of:
          0.05930151 = score(doc=3384,freq=16.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.5099235 = fieldWeight in 3384, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3384)
      0.16666667 = coord(2/12)
    
    Abstract
    In recent years there has been an increasing demand for research evaluation within universities and other research-based organisations. In parallel, there has been an increasing recognition that traditional citation-based indicators are not able to reflect the societal impacts of research and are slow to appear. This has led to the creation of new indicators for different types of research impact as well as timelier indicators, mainly derived from the Web. These indicators have been called altmetrics, webometrics or just web metrics. This book describes and evaluates a range of web indicators for aspects of societal or scholarly impact, discusses the theory and practice of using and evaluating web indicators for research assessment and outlines practical strategies for obtaining many web indicators. In addition to describing impact indicators for traditional scholarly outputs, such as journal articles and monographs, it also covers indicators for videos, datasets, software and other non-standard scholarly outputs. The book describes strategies to analyse web indicators for individual publications as well as to compare the impacts of groups of publications. The practical part of the book includes descriptions of how to use the free software Webometric Analyst to gather and analyse web data. This book is written for information science undergraduate and Master's students that are learning about alternative indicators or scientometrics as well as Ph.D. students and other researchers and practitioners using indicators to help assess research impact or to study scholarly communication.
  4. Kousha, K.; Thelwall, M.; Rezaie, S.: Can the impact of scholarly images be assessed online? : an exploratory study using image identification technology (2010) 0.02
    0.01775394 = product of:
      0.07101576 = sum of:
        0.02096625 = weight(_text_:web in 3966) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3966,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3966, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3966)
        0.029083263 = weight(_text_:world in 3966) [ClassicSimilarity], result of:
          0.029083263 = score(doc=3966,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.21233483 = fieldWeight in 3966, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3966)
        0.02096625 = weight(_text_:web in 3966) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3966,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3966, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3966)
      0.25 = coord(3/12)
    
    Abstract
    The web contains a huge number of digital pictures. For scholars publishing such images it is important to know how well used their images are, but no method seems to have been developed for monitoring the value of academic images. In particular, can the impact of scientific or artistic images be assessed through identifying images copied or reused on the Internet? This article explores a case study of 260 NASA images to investigate whether the TinEye search engine could theoretically help to provide this information. The results show that the selected pictures had a median of 11 online copies each. However, a classification of 210 of these copies reveals that only 1.4% were explicitly used in academic publications, reflecting research impact, and the majority of the NASA pictures were used for informal scholarly (or educational) communication (37%). Additional analyses of world famous paintings and scientific images about pathology and molecular structures suggest that image contents are important for the type and extent of image use. Although it is reasonable to use statistics derived from TinEye for assessing image reuse value, the extent of its image indexing is not known.
  5. Kousha, K.; Thelwall, M.: Disseminating research with web CV hyperlinks (2014) 0.02
    0.017118871 = product of:
      0.10271323 = sum of:
        0.051356614 = weight(_text_:web in 1331) [ClassicSimilarity], result of:
          0.051356614 = score(doc=1331,freq=12.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.4416067 = fieldWeight in 1331, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1331)
        0.051356614 = weight(_text_:web in 1331) [ClassicSimilarity], result of:
          0.051356614 = score(doc=1331,freq=12.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.4416067 = fieldWeight in 1331, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1331)
      0.16666667 = coord(2/12)
    
    Abstract
    Some curricula vitae (web CVs) of academics on the web, including homepages and publication lists, link to open-access (OA) articles, resources, abstracts in publishers' websites, or academic discussions, helping to disseminate research. To assess how common such practices are and whether they vary by discipline, gender, and country, the authors conducted a large-scale e-mail survey of astronomy and astrophysics, public health, environmental engineering, and philosophy across 15 European countries and analyzed hyperlinks from web CVs of academics. About 60% of the 2,154 survey responses reported having a web CV or something similar, and there were differences between disciplines, genders, and countries. A follow-up outlink analysis of 2,700 web CVs found that a third had at least one outlink to an OA target, typically a public eprint archive or an individual self-archived file. This proportion was considerably higher in astronomy (48%) and philosophy (37%) than in environmental engineering (29%) and public health (21%). There were also differences in linking to publishers' websites, resources, and discussions. Perhaps most important, however, the amount of linking to OA publications seems to be much lower than allowed by publishers and journals, suggesting that many opportunities for disseminating full-text research online are being missed, especially in disciplines without established repositories. Moreover, few academics seem to be exploiting their CVs to link to discussions, resources, or article abstracts, which seems to be another missed opportunity for publicizing research.
  6. Thelwall, M.; Buckley, K.: Topic-based sentiment analysis for the social web : the role of mood and issue-related words (2013) 0.02
    0.016773 = product of:
      0.100637995 = sum of:
        0.050318997 = weight(_text_:web in 1004) [ClassicSimilarity], result of:
          0.050318997 = score(doc=1004,freq=8.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.43268442 = fieldWeight in 1004, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1004)
        0.050318997 = weight(_text_:web in 1004) [ClassicSimilarity], result of:
          0.050318997 = score(doc=1004,freq=8.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.43268442 = fieldWeight in 1004, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1004)
      0.16666667 = coord(2/12)
    
    Abstract
    General sentiment analysis for the social web has become increasingly useful for shedding light on the role of emotion in online communication and offline events in both academic research and data journalism. Nevertheless, existing general-purpose social web sentiment analysis algorithms may not be optimal for texts focussed around specific topics. This article introduces 2 new methods, mood setting and lexicon extension, to improve the accuracy of topic-specific lexical sentiment strength detection for the social web. Mood setting allows the topic mood to determine the default polarity for ostensibly neutral expressive text. Topic-specific lexicon extension involves adding topic-specific words to the default general sentiment lexicon. Experiments with 8 data sets show that both methods can improve sentiment analysis performance in corpora and are recommended when the topic focus is tightest.
  7. Thelwall, M.: A comparison of link and URL citation counting (2011) 0.01
    0.012104871 = product of:
      0.07262922 = sum of:
        0.03631461 = weight(_text_:web in 4533) [ClassicSimilarity], result of:
          0.03631461 = score(doc=4533,freq=6.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3122631 = fieldWeight in 4533, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4533)
        0.03631461 = weight(_text_:web in 4533) [ClassicSimilarity], result of:
          0.03631461 = score(doc=4533,freq=6.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3122631 = fieldWeight in 4533, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4533)
      0.16666667 = coord(2/12)
    
    Abstract
    Purpose - Link analysis is an established topic within webometrics. It normally uses counts of links between sets of web sites or to sets of web sites. These link counts are derived from web crawlers or commercial search engines with the latter being the only alternative for some investigations. This paper compares link counts with URL citation counts in order to assess whether the latter could be a replacement for the former if the major search engines withdraw their advanced hyperlink search facilities. Design/methodology/approach - URL citation counts are compared with link counts for a variety of data sets used in previous webometric studies. Findings - The results show a high degree of correlation between the two but with URL citations being much less numerous, at least outside academia and business. Research limitations/implications - The results cover a small selection of 15 case studies and so the findings are only indicative. Significant differences between results indicate that the difference between link counts and URL citation counts will vary between webometric studies. Practical implications - Should link searches be withdrawn, then link analyses of less well linked non-academic, non-commercial sites would be seriously weakened, although citations based on e-mail addresses could help to make citations more numerous than links for some business and academic contexts. Originality/value - This is the first systematic study of the difference between link counts and URL citation counts in a variety of contexts and it shows that there are significant differences between the two.
  8. Thelwall, M.: Assessing web search engines : a webometric approach (2011) 0.01
    0.011860303 = product of:
      0.071161814 = sum of:
        0.035580907 = weight(_text_:web in 10) [ClassicSimilarity], result of:
          0.035580907 = score(doc=10,freq=4.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3059541 = fieldWeight in 10, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=10)
        0.035580907 = weight(_text_:web in 10) [ClassicSimilarity], result of:
          0.035580907 = score(doc=10,freq=4.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3059541 = fieldWeight in 10, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=10)
      0.16666667 = coord(2/12)
    
    Abstract
    Information Retrieval (IR) research typically evaluates search systems in terms of the standard precision, recall and F-measures to weight the relative importance of precision and recall (e.g. van Rijsbergen, 1979). All of these assess the extent to which the system returns good matches for a query. In contrast, webometric measures are designed specifically for web search engines and are designed to monitor changes in results over time and various aspects of the internal logic of the way in which search engines select the results to be returned. This chapter introduces a range of webometric measurements and illustrates them with case studies of Google, Bing and Yahoo! This is a very fertile area for simple and complex new investigations into search engine results.
  9. Orduna-Malea, E.; Thelwall, M.; Kousha, K.: Web citations in patents : evidence of technological impact? (2017) 0.01
    0.011860303 = product of:
      0.071161814 = sum of:
        0.035580907 = weight(_text_:web in 3764) [ClassicSimilarity], result of:
          0.035580907 = score(doc=3764,freq=4.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3059541 = fieldWeight in 3764, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3764)
        0.035580907 = weight(_text_:web in 3764) [ClassicSimilarity], result of:
          0.035580907 = score(doc=3764,freq=4.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.3059541 = fieldWeight in 3764, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3764)
      0.16666667 = coord(2/12)
    
    Abstract
    Patents sometimes cite webpages either as general background to the problem being addressed or to identify prior publications that limit the scope of the patent granted. Counts of the number of patents citing an organization's website may therefore provide an indicator of its technological capacity or relevance. This article introduces methods to extract URL citations from patents and evaluates the usefulness of counts of patent web citations as a technology indicator. An analysis of patents citing 200 US universities or 177 UK universities found computer science and engineering departments to be frequently cited, as well as research-related webpages, such as Wikipedia, YouTube, or the Internet Archive. Overall, however, patent URL citations seem to be frequent enough to be useful for ranking major US and the top few UK universities if popular hosted subdomains are filtered out, but the hit count estimates on the first search engine results page should not be relied upon for accuracy.
  10. Thelwall, M.; Buckley, K.; Paltoglou, G.; Cai, D.; Kappas, A.: Sentiment strength detection in short informal text (2010) 0.01
    0.008452717 = product of:
      0.050716303 = sum of:
        0.038646206 = weight(_text_:wide in 4200) [ClassicSimilarity], result of:
          0.038646206 = score(doc=4200,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.24476713 = fieldWeight in 4200, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4200)
        0.012070097 = product of:
          0.024140194 = sum of:
            0.024140194 = weight(_text_:22 in 4200) [ClassicSimilarity], result of:
              0.024140194 = score(doc=4200,freq=2.0), product of:
                0.12478739 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.035634913 = queryNorm
                0.19345059 = fieldWeight in 4200, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4200)
          0.5 = coord(1/2)
      0.16666667 = coord(2/12)
    
    Abstract
    A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.
    Date
    22. 1.2011 14:29:23
  11. Didegah, F.; Thelwall, M.: Determinants of research citation impact in nanoscience and nanotechnology (2013) 0.01
    0.0083865 = product of:
      0.050318997 = sum of:
        0.025159499 = weight(_text_:web in 737) [ClassicSimilarity], result of:
          0.025159499 = score(doc=737,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 737, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=737)
        0.025159499 = weight(_text_:web in 737) [ClassicSimilarity], result of:
          0.025159499 = score(doc=737,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 737, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=737)
      0.16666667 = coord(2/12)
    
    Abstract
    This study investigates a range of metrics available when a nanoscience and nanotechnology article is published to see which metrics correlate more with the number of citations to the article. It also introduces the degree of internationality of journals and references as new metrics for this purpose. The journal impact factor; the impact of references; the internationality of authors, journals, and references; and the number of authors, institutions, and references were all calculated for papers published in nanoscience and nanotechnology journals in the Web of Science from 2007 to 2009. Using a zero-inflated negative binomial regression model on the data set, the impact factor of the publishing journal and the citation impact of the cited references were found to be the most effective determinants of citation counts in all four time periods. In the entire 2007 to 2009 period, apart from journal internationality and author numbers and internationality, all other predictor variables had significant effects on citation counts.
  12. Kousha, K.; Thelwall, M.: Patent citation analysis with Google (2017) 0.01
    0.008009315 = product of:
      0.096111774 = sum of:
        0.096111774 = weight(_text_:filter in 3317) [ClassicSimilarity], result of:
          0.096111774 = score(doc=3317,freq=2.0), product of:
            0.24899386 = queryWeight, product of:
              6.987357 = idf(docFreq=110, maxDocs=44218)
              0.035634913 = queryNorm
            0.38600057 = fieldWeight in 3317, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.987357 = idf(docFreq=110, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3317)
      0.083333336 = coord(1/12)
    
    Abstract
    Citations from patents to scientific publications provide useful evidence about the commercial impact of academic research, but automatically searchable databases are needed to exploit this connection for large-scale patent citation evaluations. Google covers multiple different international patent office databases but does not index patent citations or allow automatic searches. In response, this article introduces a semiautomatic indirect method via Bing to extract and filter patent citations from Google to academic papers with an overall precision of 98%. The method was evaluated with 322,192 science and engineering Scopus articles from every second year for the period 1996-2012. Although manual Google Patent searches give more results, especially for articles with many patent citations, the difference is not large enough to be a major problem. Within Biomedical Engineering, Biotechnology, and Pharmacology & Pharmaceutics, 7% to 10% of Scopus articles had at least one patent citation but other fields had far fewer, so patent citation analysis is only relevant for a minority of publications. Low but positive correlations between Google Patent citations and Scopus citations across all fields suggest that traditional citation counts cannot substitute for patent citations when evaluating research.
  13. Thelwall, M.; Sud, P.: A comparison of methods for collecting web citation data for academic organizations (2011) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 4626) [ClassicSimilarity], result of:
          0.02096625 = score(doc=4626,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 4626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4626)
        0.02096625 = weight(_text_:web in 4626) [ClassicSimilarity], result of:
          0.02096625 = score(doc=4626,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 4626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4626)
      0.16666667 = coord(2/12)
    
  14. Haustein, S.; Peters, I.; Sugimoto, C.R.; Thelwall, M.; Larivière, V.: Tweeting biomedicine : an analysis of tweets and citations in the biomedical literature (2014) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 1229) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1229,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1229, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1229)
        0.02096625 = weight(_text_:web in 1229) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1229,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1229, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1229)
      0.16666667 = coord(2/12)
    
    Abstract
    Data collected by social media platforms have been introduced as new sources for indicators to help measure the impact of scholarly research in ways that are complementary to traditional citation analysis. Data generated from social media activities can be used to reflect broad types of impact. This article aims to provide systematic evidence about how often Twitter is used to disseminate information about journal articles in the biomedical sciences. The analysis is based on 1.4 million documents covered by both PubMed and Web of Science and published between 2010 and 2012. The number of tweets containing links to these documents was analyzed and compared to citations to evaluate the degree to which certain journals, disciplines, and specialties were represented on Twitter and how far tweets correlate with citation impact. With less than 10% of PubMed articles mentioned on Twitter, its uptake is low in general but differs between journals and specialties. Correlations between tweets and citations are low, implying that impact metrics based on tweets are different from those based on citations. A framework using the coverage of articles and the correlation between Twitter mentions and citations is proposed to facilitate the evaluation of novel social-media-based metrics.
  15. Larivière, V.; Sugimoto, C.R.; Macaluso, B.; Milojević, S.; Cronin, B.; Thelwall, M.: arXiv E-prints and the journal of record : an analysis of roles and relationships (2014) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 1285) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1285,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1285, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1285)
        0.02096625 = weight(_text_:web in 1285) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1285,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1285, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1285)
      0.16666667 = coord(2/12)
    
    Abstract
    Since its creation in 1991, arXiv has become central to the diffusion of research in a number of fields. Combining data from the entirety of arXiv and the Web of Science (WoS), this article investigates (a) the proportion of papers across all disciplines that are on arXiv and the proportion of arXiv papers that are in the WoS, (b) the elapsed time between arXiv submission and journal publication, and (c) the aging characteristics and scientific impact of arXiv e-prints and their published version. It shows that the proportion of WoS papers found on arXiv varies across the specialties of physics and mathematics, and that only a few specialties make extensive use of the repository. Elapsed time between arXiv submission and journal publication has shortened but remains longer in mathematics than in physics. In physics, mathematics, as well as in astronomy and astrophysics, arXiv versions are cited more promptly and decay faster than WoS papers. The arXiv versions of papers, both published and unpublished, have lower citation rates than published papers, although there is almost no difference in the impact of the arXiv versions of published and unpublished papers.
  16. Thelwall, M.; Maflahi, N.: Are scholarly articles disproportionately read in their own country? : An analysis of Mendeley readers (2015) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 1850) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1850,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1850, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1850)
        0.02096625 = weight(_text_:web in 1850) [ClassicSimilarity], result of:
          0.02096625 = score(doc=1850,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 1850, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1850)
      0.16666667 = coord(2/12)
    
    Abstract
    International collaboration tends to result in more highly cited research and, partly as a result of this, many research funding schemes are specifically international in scope. Nevertheless, it is not clear whether this citation advantage is the result of higher quality research or due to other factors, such as a larger audience for the publications. To test whether the apparent advantage of internationally collaborative research may be due to additional interest in articles from the countries of the authors, this article assesses the extent to which the national affiliations of the authors of articles affect the national affiliations of their Mendeley readers. Based on English-language Web of Science articles in 10 fields from science, medicine, social science, and the humanities, the results of statistical models comparing author and reader affiliations suggest that, in most fields, Mendeley users are disproportionately readers of articles authored from within their own country. In addition, there are several cases in which Mendeley users from certain countries tend to ignore articles from specific other countries, although it is not clear whether this reflects national biases or different national specialisms within a field. In conclusion, research funders should not incentivize international collaboration on the basis that it is, in general, higher quality because its higher impact may be primarily due to its larger audience. Moreover, authors should guard against national biases in their reading to select only the best and most relevant publications to inform their research.
  17. Thelwall, M.; Kousha, K.: Goodreads : a social network site for book readers (2017) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 3534) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3534,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3534, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3534)
        0.02096625 = weight(_text_:web in 3534) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3534,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3534, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3534)
      0.16666667 = coord(2/12)
    
    Abstract
    Goodreads is an Amazon-owned book-based social web site for members to share books, read, review books, rate books, and connect with other readers. Goodreads has tens of millions of book reviews, recommendations, and ratings that may help librarians and readers to select relevant books. This article describes a first investigation of the properties of Goodreads users, using a random sample of 50,000 members. The results suggest that about three quarters of members with a public profile are female, and that there is little difference between male and female users in patterns of behavior, except for females registering more books and rating them less positively. Goodreads librarians and super-users engage extensively with most features of the site. The absence of strong correlations between book-based and social usage statistics (e.g., numbers of friends, followers, books, reviews, and ratings) suggests that members choose their own individual balance of social and book activities and rarely ignore one at the expense of the other. Goodreads is therefore neither primarily a book-based website nor primarily a social network site but is a genuine hybrid, social navigation site.
  18. Sud, P.; Thelwall, M.: Not all international collaboration is beneficial : the Mendeley readership and citation impact of biochemical research collaboration (2016) 0.01
    0.0055910004 = product of:
      0.033546 = sum of:
        0.016773 = weight(_text_:web in 3048) [ClassicSimilarity], result of:
          0.016773 = score(doc=3048,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.14422815 = fieldWeight in 3048, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=3048)
        0.016773 = weight(_text_:web in 3048) [ClassicSimilarity], result of:
          0.016773 = score(doc=3048,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.14422815 = fieldWeight in 3048, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=3048)
      0.16666667 = coord(2/12)
    
    Abstract
    Biochemistry is a highly funded research area that is typified by large research teams and is important for many areas of the life sciences. This article investigates the citation impact and Mendeley readership impact of biochemistry research from 2011 in the Web of Science according to the type of collaboration involved. Negative binomial regression models are used that incorporate, for the first time, the inclusion of specific countries within a team. The results show that, holding other factors constant, larger teams robustly associate with higher impact research, but including additional departments has no effect and adding extra institutions tends to reduce the impact of research. Although international collaboration is apparently not advantageous in general, collaboration with the United States, and perhaps also with some other countries, seems to increase impact. In contrast, collaborations with some other nations seem to decrease impact, although both findings could be due to factors such as differing national proportions of excellent researchers. As a methodological implication, simpler statistical models would find international collaboration to be generally beneficial and so it is important to take into account specific countries when examining collaboration.
  19. Kousha, K.; Thelwall, M.: Can Amazon.com reviews help to assess the wider impacts of books? (2016) 0.00
    0.0038646206 = product of:
      0.046375446 = sum of:
        0.046375446 = weight(_text_:wide in 2768) [ClassicSimilarity], result of:
          0.046375446 = score(doc=2768,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.29372054 = fieldWeight in 2768, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2768)
      0.083333336 = coord(1/12)
    
    Abstract
    Although citation counts are often used to evaluate the research impact of academic publications, they are problematic for books that aim for educational or cultural impact. To fill this gap, this article assesses whether a number of simple metrics derived from Amazon.com reviews of academic books could provide evidence of their impact. Based on a set of 2,739 academic monographs from 2008 and a set of 1,305 best-selling books in 15 Amazon.com academic subject categories, the existence of significant but low or moderate correlations between citations and numbers of reviews, combined with other evidence, suggests that online book reviews tend to reflect the wider popularity of a book rather than its academic impact, although there are substantial disciplinary differences. Metrics based on online reviews are therefore recommended for the evaluation of books that aim at a wide audience inside or outside academia when it is important to capture the broader impacts of educational or cultural activities and when they cannot be manipulated in advance of the evaluation.
  20. Kousha, K.; Thelwall, M.: News stories as evidence for research? : BBC citations from articles, books, and Wikipedia (2017) 0.00
    0.0034274952 = product of:
      0.041129943 = sum of:
        0.041129943 = weight(_text_:world in 3760) [ClassicSimilarity], result of:
          0.041129943 = score(doc=3760,freq=4.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.30028677 = fieldWeight in 3760, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3760)
      0.083333336 = coord(1/12)
    
    Abstract
    Although news stories target the general public and are sometimes inaccurate, they can serve as sources of real-world information for researchers. This article investigates the extent to which academics exploit journalism using content and citation analyses of online BBC News stories cited by Scopus articles. A total of 27,234 Scopus-indexed publications have cited at least one BBC News story, with a steady annual increase. Citations from the arts and humanities (2.8% of publications in 2015) and social sciences (1.5%) were more likely than citations from medicine (0.1%) and science (<0.1%). Surprisingly, half of the sampled Scopus-cited science and technology (53%) and medicine and health (47%) stories were based on academic research, rather than otherwise unpublished information, suggesting that researchers have chosen a lower-quality secondary source for their citations. Nevertheless, the BBC News stories that were most frequently cited by Scopus, Google Books, and Wikipedia introduced new information from many different topics, including politics, business, economics, statistics, and reports about events. Thus, news stories are mediating real-world knowledge into the academic domain, a potential cause for concern.
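
The score breakdowns listed under each result follow the same arithmetic throughout: each matching clause contributes queryWeight × fieldWeight, the clause scores are summed, and the sum is scaled by coord (matching clauses / total clauses). The sketch below retraces that arithmetic for results 12 and 13 using only the numbers printed above. The function names are illustrative and not part of any search-engine API, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption inferred from the listed idf values, which it reproduces.

```python
import math

# Constants copied from the explain output above.
QUERY_NORM = 0.035634913
MAX_DOCS = 44218


def classic_idf(doc_freq, max_docs):
    # Assumed formula; it reproduces the listed values,
    # e.g. idf(docFreq=110, maxDocs=44218) = 6.987357.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def clause_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # One "weight(_text_:term in doc)" node of the explain tree.
    tf = math.sqrt(freq)                      # tf(freq=2.0) = 1.4142135
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm           # "queryWeight" line
    field_weight = tf * idf * field_norm      # "fieldWeight" line
    return query_weight * field_weight


def document_score(clause_scores, total_clauses):
    # Sum of the matching clauses, scaled by coord(matching/total).
    coord = len(clause_scores) / total_clauses
    return sum(clause_scores) * coord


# Result 12: one matching clause ("filter"), coord(1/12).
filter_clause = clause_score(2.0, 110, MAX_DOCS, QUERY_NORM, 0.0390625)
score_12 = document_score([filter_clause], 12)

# Result 13: the "web" clause appears twice in the tree, coord(2/12).
web_clause = clause_score(2.0, 4597, MAX_DOCS, QUERY_NORM, 0.0390625)
score_13 = document_score([web_clause, web_clause], 12)

print(f"result 12: {score_12:.9f}")  # listed above as 0.008009315
print(f"result 13: {score_13:.9f}")  # listed above as 0.0069887503
```

Small differences in the last digits are expected, since the listed values appear to come from single-precision arithmetic while Python uses double precision.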