Search (275 results, page 1 of 14)

  • × theme_ss:"Informetrie"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Shibata, N.; Kajikawa, Y.; Sakata, I.: Link prediction in citation networks (2012) 0.17
    
    Abstract
    In this article, we build models to predict the existence of citations among papers by formulating link prediction over 5 large-scale citation-network datasets. A supervised machine-learning model is applied with 11 features. Our learner performs very well, with F1 values between 0.74 and 0.82. Three features in particular, the link-based Jaccard coefficient, the difference in betweenness centrality, and the cosine similarity of term frequency-inverse document frequency (tf-idf) vectors, largely determine the predictions of citations. The results also indicate that different models are required for different types of research areas: fields centered on a single issue versus fields spanning multiple issues. In fields with multiple issues there are barriers between subfields, because our results indicate that papers tend to be cited locally within each subfield. Therefore, one must consider the typology of the targeted research areas when building models for link prediction in citation networks.
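    Two of the influential features named above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the dict-based sparse-vector representation are assumptions for the sketch.

    ```python
    import math

    def jaccard(neighbors_a, neighbors_b):
        """Jaccard coefficient of two papers' neighbor sets in the citation graph."""
        a, b = set(neighbors_a), set(neighbors_b)
        if not a and not b:
            return 0.0
        return len(a & b) / len(a | b)

    def cosine(vec_a, vec_b):
        """Cosine similarity of two sparse tf-idf vectors given as {term: weight} dicts."""
        dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
        norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
        norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)
    ```

    In a link-prediction setup, values like these would be computed for each candidate paper pair and fed to the classifier as features.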
  2. Ni, C.; Shaw, D.; Lind, S.M.; Ding, Y.: Journal impact and proximity : an assessment using bibliographic features (2013) 0.12
    
    Abstract
    Journals in the Information Science & Library Science category of Journal Citation Reports (JCR) were compared using both bibliometric and bibliographic features. The data collected covered journal impact factor (JIF), number of issues per year, number of authors per article, longevity, editorial board membership, frequency of publication, number of databases indexing the journal, number of aggregators providing full-text access, country of publication, JCR categories, Dewey Decimal Classification, and the journal's statement of scope. Three features correlated significantly with JIF: the number of editorial board members and the number of JCR categories in which a journal is listed correlated positively, while journal longevity correlated negatively. Co-word analysis of journal descriptions provided a proximity clustering of journals, which differed considerably from the clusters based on editorial board membership. Finally, a multiple linear regression model was built to predict JIF from all the collected bibliographic features.
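    The correlation step described above amounts to computing a Pearson correlation between each bibliographic feature and JIF. A minimal sketch follows; the feature values in the test are illustrative, not the study's data.

    ```python
    import math

    def pearson(xs, ys):
        """Pearson correlation coefficient between two equal-length sequences,
        e.g. a bibliographic feature (board size) and journal impact factor."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)
    ```

    A positive value corresponds to findings like "more editorial board members, higher JIF"; a negative value to "longer-lived journals, lower JIF".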
  3. Haustein, S.; Sugimoto, C.; Larivière, V.: Social media in scholarly communication : Guest editorial (2015) 0.12
    
    Abstract
    One of the solutions to help scientists filter the most relevant publications and, thus, to stay current on developments in their fields during the transition from "little science" to "big science", was the introduction of citation indexing as a Wellsian "World Brain" (Garfield, 1964) of scientific information: It is too much to expect a research worker to spend an inordinate amount of time searching for the bibliographic descendants of antecedent papers. It would not be excessive to demand that the thorough scholar check all papers that have cited or criticized such papers, if they could be located quickly. The citation index makes this check practicable (Garfield, 1955, p. 108). In retrospect, citation indexing can be perceived as a pre-social-web version of crowdsourcing, as it is based on the concept that the community of citing authors outperforms indexers in highlighting cognitive links between papers, particularly on the level of specific ideas and concepts (Garfield, 1983). Over the last 50 years, citation analysis and, more generally, bibliometric methods have developed from information retrieval tools into research evaluation metrics, where they are presumed to make scientific funding more efficient and effective (Moed, 2006). However, the dominance of bibliometric indicators in research evaluation has also led to significant goal displacement (Merton, 1957) and the oversimplification of notions of "research productivity" and "scientific quality", creating adverse effects such as salami publishing, honorary authorships, citation cartels, and misuse of indicators (Binswanger, 2015; Cronin and Sugimoto, 2014; Frey and Osterloh, 2006; Haustein and Larivière, 2015; Weingart, 2005).
    Furthermore, the rise of the web, and subsequently, the social web, has challenged the quasi-monopolistic status of the journal as the main form of scholarly communication and citation indices as the primary assessment mechanisms. Scientific communication is becoming more open, transparent, and diverse: publications are increasingly open access; manuscripts, presentations, code, and data are shared online; research ideas and results are discussed and criticized openly on blogs; and new peer review experiments, with open post publication assessment by anonymous or non-anonymous referees, are underway. The diversification of scholarly production and assessment, paired with the increasing speed of the communication process, leads to an increased information overload (Bawden and Robinson, 2008), demanding new filters. The concept of altmetrics, short for alternative (to citation) metrics, was created out of an attempt to provide a filter (Priem et al., 2010) and to steer against the oversimplification of the measurement of scientific success solely on the basis of number of journal articles published and citations received, by considering a wider range of research outputs and metrics (Piwowar, 2013). Although the term altmetrics was introduced in a tweet in 2010 (Priem, 2010), the idea of capturing traces - "polymorphous mentioning" (Cronin et al., 1998, p. 1320) - of scholars and their documents on the web to measure "impact" of science in a broader manner than citations was introduced years before, largely in the context of webometrics (Almind and Ingwersen, 1997; Thelwall et al., 2005):
    There will soon be a critical mass of web-based digital objects and usage statistics on which to model scholars' communication behaviors - publishing, posting, blogging, scanning, reading, downloading, glossing, linking, citing, recommending, acknowledging - and with which to track their scholarly influence and impact, broadly conceived and broadly felt (Cronin, 2005, p. 196). A decade after Cronin's prediction and five years after the coining of altmetrics, the time seems ripe to reflect upon the role of social media in scholarly communication. This Special Issue does so by providing an overview of current research on the indicators and metrics grouped under the umbrella term of altmetrics, on their relationships with traditional indicators of scientific activity, and on the uses that are made of the various social media platforms - on which these indicators are based - by scientists of various disciplines.
    Date
    20.1.2015 18:30:22
  4. Amolochitis, E.; Christou, I.T.; Tan, Z.-H.; Prasad, R.: ¬A heuristic hierarchical scheme for academic search and retrieval (2013) 0.11
    
    Abstract
    We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score, and a graph-theoretic score that relates the paper's index terms to each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed re-ranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified by 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by the most common metrics in information retrieval, including Normalized Discounted Cumulative Gain (NDCG) and Expected Reciprocal Rank (ERR), as well as a newly introduced lexicographic rule (LEX) for ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can easily be plugged into any existing search engine for retrieval of academic publications.
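    Two ingredients of the scheme above lend themselves to short sketches: a time-depreciated citation score and the NDCG evaluation metric. The exponential decay rate is an assumption for illustration; the paper does not specify its depreciation function here.

    ```python
    import math

    def depreciated_citation_score(citation_years, current_year, decay=0.1):
        """Time-depreciated citation score: each citation contributes
        exp(-decay * age), so recent citations count more than old ones.
        The decay constant is an illustrative assumption."""
        return sum(math.exp(-decay * (current_year - y)) for y in citation_years)

    def ndcg(relevances):
        """Normalized Discounted Cumulative Gain for one ranked result list,
        given graded relevance judgments in rank order."""
        def dcg(rels):
            return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
        ideal = dcg(sorted(relevances, reverse=True))
        return dcg(relevances) / ideal if ideal > 0 else 0.0
    ```

    A perfectly ordered result list scores NDCG 1.0; any inversion of the ideal order scores less.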
  5. Wang, F.; Wolfram, D.: Assessment of journal similarity based on citing discipline analysis (2015) 0.10
    
    Abstract
    This study compares the range of disciplines of citing journal articles to determine how closely related journals assigned to the same Web of Science research area are. The frequency distribution of disciplines across citing articles provides a signature for a cited journal that permits it to be compared with other journals using similarity comparison techniques. As an initial exploration, citing-discipline data for 40 high-impact-factor journals assigned to the "information science and library science" category of the Web of Science were compared across 5 time periods. Similarity relationships were determined using multidimensional scaling and hierarchical cluster analysis to compare the outcomes produced by the proposed citing-discipline method and the established cocitation method. The maps and clustering outcomes reveal that a number of journals in allied areas of the information science and library science category may not be very closely related to each other, or may not be appropriately situated in the category studied. The citing-discipline similarity data produced outcomes similar to those of the cocitation data, but with some notable differences. Because the citing-discipline method relies on a citing perspective different from cocitation, it may provide a complementary way to compare journal similarity that is less labor intensive than cocitation analysis.
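    The "signature" idea above can be sketched as a normalized discipline distribution plus a cosine comparison. This is a hedged illustration of the general technique, not the study's code; function names and the toy counts are assumptions.

    ```python
    import math

    def discipline_signature(citing_counts):
        """Normalize raw citing-discipline counts into a frequency distribution.
        citing_counts: {discipline: number of citing articles from that discipline}."""
        total = sum(citing_counts.values())
        return {d: c / total for d, c in citing_counts.items()}

    def signature_similarity(sig_a, sig_b):
        """Cosine similarity between two journals' discipline signatures."""
        disciplines = set(sig_a) | set(sig_b)
        dot = sum(sig_a.get(d, 0.0) * sig_b.get(d, 0.0) for d in disciplines)
        na = math.sqrt(sum(v * v for v in sig_a.values()))
        nb = math.sqrt(sum(v * v for v in sig_b.values()))
        return dot / (na * nb) if na and nb else 0.0
    ```

    A pairwise similarity matrix built this way is what multidimensional scaling or hierarchical clustering would then operate on.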
  6. Zhao, D.; Strotmann, A.: Dimensions and uncertainties of author citation rankings : lessons learned from frequency-weighted in-text citation counting (2016) 0.10
    
    Abstract
    In-text frequency-weighted citation counting has been seen as a particularly promising solution to a well-known problem of citation analysis: it treats all citations equally, whether they are crucial to the citing paper or perfunctory. But what is a good weighting scheme? We compare 12 different in-text citation frequency-weighting schemes in the field of library and information science (LIS) and explore author citation impact patterns based on their performance under these schemes. Our results show that author ranks vary widely across weighting schemes that favor, or are biased against, common citation impact patterns (substantiated, applied, or noted). These variations separate LIS authors quite clearly into groups with these impact patterns. The hard upper and lower bounds for reasonable author ranks provided by consensus rank limits suggest that author citation ranks may be subject to something like an uncertainty principle.
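    Frequency-weighted in-text citation counting can be sketched with a few representative weighting schemes. The three schemes below are common illustrative choices, not necessarily among the 12 the study compares.

    ```python
    import math

    # Illustrative weighting schemes mapping in-text mention count -> weight.
    SCHEMES = {
        "binary": lambda n: 1.0,          # classic counting: one citation, one vote
        "linear": lambda n: float(n),     # every in-text mention counts fully
        "log": lambda n: math.log1p(n),   # dampened reward for repeated mentions
    }

    def weighted_citation_counts(in_text_counts, scheme="linear"):
        """in_text_counts: {cited_author: number of in-text mentions in one
        citing paper}. Returns the weighted credit each cited author receives."""
        weight = SCHEMES[scheme]
        return {author: weight(n) for author, n in in_text_counts.items()}
    ```

    Under "binary", an author mentioned ten times gets the same credit as one mentioned once; under "linear" the ranks can shift dramatically, which is exactly the sensitivity the study examines.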
  7. Liu, X.; Zhang, J.; Guo, C.: Full-text citation analysis : a new method to enhance scholarly networks (2013) 0.09
    
    Abstract
    In this article, we use innovative full-text citation analysis along with supervised topic modeling and network-analysis algorithms to enhance classical bibliometric analysis and publication/author/venue ranking. By utilizing citation contexts extracted from a large number of full-text publications, each citation or publication is represented by a probability distribution over a set of predefined topics, where each topic is labeled by an author-contributed keyword. We then used publication/citation topic distributions to generate a citation graph with vertex prior and edge transition probability distributions. The publication importance score for each given topic is calculated by PageRank with edge and vertex prior distributions. To evaluate this work, we sampled 104 topics (labeled with keywords) in review papers. The cited publications of each review paper are assumed to be "important publications" for the target topic (keyword), and we use these cited publications to validate our topic-ranking result and to compare different publication-ranking lists. Evaluation results show that full-text citation and publication-content prior topic distributions, along with the classical PageRank algorithm, can significantly enhance bibliometric analysis and scientific publication ranking performance, compared with term frequency-inverse document frequency (tf-idf), language model, BM25, PageRank, and PageRank + language model (p < .001), for academic information retrieval (IR) systems.
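    PageRank with a vertex prior (often called personalized or topic-sensitive PageRank) can be sketched as a power iteration where the teleport distribution is the topic prior rather than uniform. This is a generic sketch of the technique, not the authors' implementation; the damping factor and iteration count are assumptions.

    ```python
    def personalized_pagerank(graph, prior, damping=0.85, iters=50):
        """graph: {node: [cited nodes]}; prior: {node: topic prior prob, sums to 1}.
        Returns an importance score per node biased toward the prior."""
        nodes = list(graph)
        rank = dict(prior)
        for _ in range(iters):
            new = {}
            for n in nodes:
                # mass flowing in along citation edges, split evenly over out-links
                incoming = sum(rank[m] / len(graph[m]) for m in nodes if n in graph[m])
                new[n] = (1 - damping) * prior[n] + damping * incoming
            rank = new
        return rank
    ```

    In the article's setting, the prior would come from the topic distribution of each publication, so the same citation graph yields a different ranking per topic.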
  8. Lievers, W.B.; Pilkey, A.K.: Characterizing the frequency of repeated citations : the effects of journal, subject area, and self-citation (2012) 0.08
    
    Abstract
    Previous studies have repeatedly demonstrated that the relevance of a citing document is related to the number of times the source document is cited within it. Despite the ease with which electronic documents would permit the incorporation of this information into citation-based document search and retrieval systems, the possibilities of repeated citations remain untapped. Part of this under-utilization may be due to the fact that very little is known about the pattern of repeated citations in scholarly literature, or how this pattern varies as a function of journal, academic discipline, or self-citation. The current research addresses these unanswered questions in order to facilitate the future incorporation of repeated-citation information into document search and retrieval systems. Using data mining of electronic texts, the citation characteristics of nine different journals, covering three different academic fields (economics, computing, and medicine & biology), were characterized. It was found that the frequency (f) with which a reference is cited N or more times within a document is consistent across the sampled journals and academic fields. Self-citation causes an increase in frequency, and this effect becomes more pronounced for large N. The objectivity, automatability, and insensitivity of repeated citations to journal and discipline present powerful opportunities for improving citation-based document search.
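    The quantity f described above, the fraction of a document's references cited N or more times in its text, can be sketched directly from a list of in-text citation occurrences. The data representation is an assumption for the sketch.

    ```python
    from collections import Counter

    def repeated_citation_frequency(in_text_mentions, n):
        """in_text_mentions: list of reference ids, one entry per in-text
        citation occurrence in a single document. Returns the fraction of
        distinct references mentioned at least n times."""
        counts = Counter(in_text_mentions)
        if not counts:
            return 0.0
        return sum(1 for c in counts.values() if c >= n) / len(counts)
    ```

    Computed across many documents, curves of f against N are what the study compares across journals, disciplines, and self- versus external citations.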
  9. Zhao, D.; Strotmann, A.; Cappello, A.: In-text function of author self-citations : implications for research evaluation practice (2018) 0.08
    
    Abstract
    Author self-citations were examined as to their function, frequency, and location in the full text of research articles and compared with external citations. Function analysis was based on manual coding of a small dataset in the field of library and information studies, whereas the analyses by frequency and location used both this small dataset and a large dataset from PubMed Central. Strong evidence was found that self-citations appear more likely to serve as substantial citations in a text than do external citations. This finding challenges previous studies that assumed that self-citations should be discounted or even removed and suggests that self-citations should be given more weight in citation analysis, if anything.
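    Separating self-citations from external citations, as the analysis above requires, reduces in its simplest form to an author-overlap test. This is a deliberately naive sketch: it assumes author names are already normalized strings, whereas real author-name disambiguation is considerably harder.

    ```python
    def is_self_citation(citing_authors, cited_authors):
        """True if any author of the citing paper also authored the cited paper.
        Assumes pre-normalized author-name strings (an assumption; real
        disambiguation must handle initials, transliteration, and homonyms)."""
        return bool(set(citing_authors) & set(cited_authors))
    ```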
  10. Marx, W.; Bornmann, L.; Barth, A.; Leydesdorff, L.: Detecting the historical roots of research fields by reference publication year spectroscopy (RPYS) (2014) 0.07
    
    Abstract
    We introduce the quantitative method named "Reference Publication Year Spectroscopy" (RPYS). With this method one can determine the historical roots of research fields and quantify their impact on current research. RPYS is based on analyzing the frequency with which references are cited in the publications of a specific research field, as a function of the publication years of those cited references. The origins show up as more or less pronounced peaks, mostly caused by individual publications that are cited particularly frequently. In this study, we use research on graphene and on solar cells to illustrate how RPYS works and what results it can deliver.
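The spectroscopy step described above — counting cited references by publication year and looking for pronounced peaks — can be sketched as follows. This is a minimal illustration: the function names and the simple mean-based peak heuristic are assumptions, not the authors' exact procedure.

```python
from collections import Counter

def rpys_spectrum(cited_ref_years):
    """Count how often each publication year occurs among cited references.

    Peaks in this spectrum point to historically influential publications.
    """
    return Counter(cited_ref_years)

def peak_years(spectrum, threshold=2.0):
    """Flag years whose count exceeds `threshold` times the mean count over
    all years present (a crude peak heuristic for illustration only)."""
    mean = sum(spectrum.values()) / len(spectrum)
    return sorted(y for y, n in spectrum.items() if n > threshold * mean)
```

For example, a field whose papers repeatedly cite one publication from 1905 would show 1905 as a peak year.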
  11. Shah, T.A.; Gul, S.; Gaur, R.C.: Authors self-citation behaviour in the field of Library and Information Science (2015) 0.06
    
    Abstract
    Purpose The purpose of this paper is to analyse author self-citation behaviour in the field of Library and Information Science, together with the various factors governing that behaviour. Design/methodology/approach The 2012 edition of the Social Science Citation Index was consulted for the selection of LIS journals. Under the subject heading "Information Science and Library Science" there were 84 journals, and of these 12 were selected for the study based on systematic sampling. The study was confined to original research and review articles published in the selected journals in 2009. The main reason to choose 2009 was to obtain at least five years (2009-2013) of citation data from the Web of Science Core Collection (excluding the Book Citation Index) and the SciELO Citation Index. A citation was treated as a self-citation whenever one of the authors of the citing and cited papers was common, i.e., the sets of co-authors of the citing and the cited paper are not disjoint. To minimize the risk of homonyms, spelling variants and misspellings in authors' names, the authors compared full author names in citing and cited articles. Findings A positive correlation between the number of authors and the total number of citations exists, with no correlation between the number of authors and the number/share of self-citations, i.e., self-citations are not affected by the number of co-authors of a paper. Articles produced in collaboration attract more self-citations than articles produced by a single author. There is no statistically significant variation in citation counts (total and self-citations) across works resulting from different types of collaboration. A strong and statistically significant positive correlation exists between total citation count and the frequency of self-citations. No relation could be ascertained between total citation count and the proportion of self-citations. 
Authors tend to cite their own recent works more than the works of other authors. Total citation count and the number of self-citations are positively correlated with the impact factor of the source publication, and the correlation coefficient for total citations is much higher than that for self-citations. A negative correlation exists between the impact factor and the share of self-citations. Of particular note is that the correlation is weak in all cases. Research limitations/implications The research provides an understanding of author self-citations in the field of LIS. Originality/value Readers are encouraged to extend the study by taking into account a larger sample, tracing citations also from the Book Citation Index (WoS) and comparing the results with other allied subjects so as to validate the robustness of these findings.
    Date
    20. 1.2015 18:30:22
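The operational definition used in this study — a citation counts as a self-citation whenever the citing and cited author sets are not disjoint — reduces to a single set test. This is a sketch; the full-name comparison the authors performed to avoid homonyms is not reproduced.

```python
def is_self_citation(citing_authors, cited_authors):
    """True when the citing and cited papers share at least one author,
    i.e. the two co-author sets are not disjoint."""
    return not set(citing_authors).isdisjoint(set(cited_authors))
```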
  12. Zhao, R.; Wu, S.: ¬The network pattern of journal knowledge transfer in library and information science in China (2014) 0.06
    
    Abstract
    Using the library and information science journals for 2003-2012 in Nanjing University's Chinese Social Sciences Citation Index as data sources, the paper reveals the citation structure implied in these journals by applying social network analysis. Results show that, first, journal knowledge-transfer activity in library and information science is frequent, and both the level of knowledge and discipline integration and the knowledge gap influence knowledge-transfer activity. According to their out-degree and in-degree, journals can be divided into three types. Second, based on professional bias and citation frequency, the knowledge-transfer network can be divided into four blocks. With the change in discipline capacity and the knowledge gap among journals, the "core-periphery" structure of the knowledge-transfer network is getting weaker. Finally, regions of the knowledge-transfer network evolved from a "weak-weak" subgroup to a "strong-weak" or "weak-strong" subgroup, and then moved to a "strong-strong" subgroup.
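The out-degree/in-degree classification of journals mentioned in the abstract can be illustrated with a small sketch over citation edges. The function name and the decision to ignore journal self-citations are assumptions for illustration; the paper's block modeling is not reproduced.

```python
from collections import defaultdict

def journal_degrees(citations):
    """Given (citing_journal, cited_journal) pairs, return per-journal
    out-degree (citations given) and in-degree (citations received)."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for citing, cited in citations:
        if citing != cited:  # ignore journal self-citations
            out_deg[citing] += 1
            in_deg[cited] += 1
    return dict(out_deg), dict(in_deg)
```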
  13. Ferrer-i-Cancho, R.; Vitevitch, M.S.: ¬The origins of Zipf's meaning-frequency law (2018) 0.05
    
    Abstract
    In his pioneering research, G.K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency. He derived this relationship from two assumptions: that words follow Zipf's law for word frequencies (a power law dependency between frequency and rank) and Zipf's law of meaning distribution (a power law dependency between number of meanings and rank). Here we show that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning-frequency law or relaxed versions. Interestingly, this assumption can be justified as the outcome of a biased random walk in the process of mental exploration.
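The two power laws named in the abstract combine into the square-root relationship once the rank variable is eliminated; a standard derivation sketch, using Zipf's classical exponents:

```latex
% Zipf's two laws, over word rank r:
f(r) \propto r^{-\alpha} \quad\text{(rank--frequency law)}, \qquad
\mu(r) \propto r^{-\gamma} \quad\text{(law of meaning distribution)}
% Eliminating the rank: r \propto f^{-1/\alpha}, hence
\mu \propto f^{\gamma/\alpha}
% With the classical exponents \alpha \approx 1 and \gamma \approx 1/2:
\mu \propto \sqrt{f}
```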
  14. Li, R.; Chambers, T.; Ding, Y.; Zhang, G.; Meng, L.: Patent citation analysis : calculating science linkage based on citing motivation (2014) 0.05
    
    Abstract
    Science linkage is a widely used patent bibliometric indicator to measure patent linkage to scientific research based on the frequency of citations to scientific papers within the patent. Science linkage is also regarded as noisy because the subject of patent citation behavior varies from inventors/applicants to examiners. In order to identify and ultimately reduce this noise, we analyzed the different citing motivations of examiners and inventors/applicants. We built 4 hypotheses based upon our study of patent law, the unique economic nature of a patent, and a patent citation's market effect. To test our hypotheses, we conducted an expert survey based on our science linkage calculation in the domain of catalyst from U.S. patent data (2006-2009) over 3 types of citations: self-citation by inventor/applicant, non-self-citation by inventor/applicant, and citation by examiner. According to our results, evaluated by domain experts, we conclude that the non-self-citation by inventor/applicant is quite noisy and cannot indicate science linkage and that self-citation by inventor/applicant, although limited, is more appropriate for understanding science linkage.
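Science linkage as characterized above is commonly computed as the average number of references to scientific (non-patent) literature per patent. A minimal sketch, where the function name and input shape are assumptions:

```python
def science_linkage(patents):
    """Average number of scientific (non-patent) references per patent.

    `patents` maps a patent id to a list of its cited items, each flagged
    True for scientific literature and False for patent literature."""
    if not patents:
        return 0.0
    npl_counts = [sum(1 for is_sci in refs if is_sci)
                  for refs in patents.values()]
    return sum(npl_counts) / len(npl_counts)
```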
  15. Sedhai, S.; Sun, A.: ¬An analysis of 14 Million tweets on hashtag-oriented spamming* (2017) 0.05
    
    Abstract
    Over the years, Twitter has become a popular platform for information dissemination and information gathering. However, the popularity of Twitter has attracted not only legitimate users but also spammers who exploit social graphs, popular keywords, and hashtags for malicious purposes. In this paper, we present a detailed analysis of the HSpam14 dataset, which contains 14 million tweets with spam and ham (i.e., nonspam) labels, to understand spamming activities on Twitter. The primary focus of this paper is to analyze various aspects of spam on Twitter based on hashtags, tweet contents, and user profiles, which are useful for both tweet-level and user-level spam detection. First, we compare the usage of hashtags in spam and ham tweets based on frequency, position, orthography, and co-occurrence. Second, for content-based analysis, we analyze the variations in word usage, metadata, and near-duplicate tweets. Third, for user-based analysis, we investigate user profile information. In our study, we validate that spammers use popular hashtags to promote their tweets. We also observe differences in the usage of words in spam and ham tweets. Spam tweets are more likely to be emphasized using exclamation points and capitalized words. Furthermore, we observe that spammers use multiple accounts to post near-duplicate tweets to promote their services and products. Unlike spammers, legitimate users are likely to provide more information such as their locations and personal descriptions in their profiles. In summary, this study presents a comprehensive analysis of hashtags, tweet contents, and user profiles in Twitter spamming.
  16. Prathap, G.: Fractionalized exergy for evaluating research performance (2011) 0.05
    
    Abstract
    The approach based on "thermodynamic" considerations, which quantifies research performance using an exergy term defined as X = iC (where i is the impact and C is the number of citations), is now extended to cases where fractionalized counting of citations is used instead of integer counting.
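With X = iC and the usual reading i = C/P (citations per paper over a set of P papers), the quantity and an illustrative fractionalized variant can be sketched as follows. The fractional variant shown divides each paper's citations by its author count; this is an assumption for illustration, not necessarily the paper's exact formulation.

```python
def exergy(citations_per_paper):
    """Exergy X = i * C, where C is the total citation count of a set of
    P papers and i = C / P is the impact (citations per paper)."""
    C = sum(citations_per_paper)
    i = C / len(citations_per_paper)
    return i * C

def fractional_exergy(papers):
    """Fractionalized variant: each paper is a (citations, n_authors) pair,
    and its citations are divided among its authors before computing X."""
    shares = [c / n_authors for c, n_authors in papers]
    C = sum(shares)
    i = C / len(papers)
    return i * C
```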
  17. Wan, X.; Liu, F.: WL-index : leveraging citation mention number to quantify an individual's scientific impact (2014) 0.05
    
    Abstract
    A number of bibliometric indices have been developed to evaluate an individual's scientific impact, the most popular being the h-index and its variants. However, existing bibliometric indices are computed based on the number of citations received by each article; they do not consider the frequency with which individual citations are mentioned in an article. We use "citation mention" to denote a unique occurrence of a cited reference mentioned in the citing article, and thus some citations may have more than one mention in an article. According to our analysis of the ACL Anthology Network corpus in the natural language processing field, more than 40% of cited references have been mentioned twice or more in the corresponding citing articles. We argue that the citation mention is a preferable unit for representing the citation relationships between articles; that is, a reference article mentioned m times in the citing article will be considered to have received m citations, rather than one citation. Based on this assumption, we revise the h-index and propose a new bibliometric index, the WL-index, to evaluate an individual's scientific impact. According to our empirical analysis, the proposed WL-index more accurately discriminates between program committee chairs of reputable conferences and ordinary authors.
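Under the mention-counting assumption described above — an article mentioning a reference m times contributes m rather than 1 — an h-type index over mention totals can be sketched as follows. This is one plausible reading for illustration, not the authors' exact WL-index definition.

```python
def h_type_index(scores):
    """Largest k such that at least k papers have a score of at least k."""
    scores = sorted(scores, reverse=True)
    k = 0
    for rank, s in enumerate(scores, start=1):
        if s >= rank:
            k = rank
        else:
            break
    return k

def mention_based_index(mentions_per_paper):
    """Sum each paper's mention counts (one entry per citing article), then
    apply the h-type threshold to the totals."""
    totals = [sum(m) for m in mentions_per_paper]
    return h_type_index(totals)
```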
  18. Zubiaga, A.: ¬A longitudinal assessment of the persistence of twitter datasets (2018) 0.05
    
    Abstract
    Social media datasets are not always completely replicable. Having to adhere to requirements of platforms such as Twitter, researchers can only release a list of unique identifiers, which others can then use to recollect the data themselves. This leads to subsets of the data no longer being available, as content can be deleted or user accounts deactivated. To quantify the long-term impact of this in the replicability of datasets, we perform a longitudinal analysis of the persistence of 30 Twitter datasets, which include more than 147 million tweets. By recollecting Twitter datasets ranging from 0 to 4 years old by using the tweet IDs, we look at four different factors quantifying the extent to which recollected datasets resemble original ones: completeness, representativity, similarity, and changingness. Although the ratio of available tweets keeps decreasing as the dataset gets older, we find that the textual content of the recollected subset is still largely representative of the original dataset. The representativity of the metadata, however, keeps fading over time, both because the dataset shrinks and because certain metadata, such as the users' number of followers, keeps changing. Our study has important implications for researchers sharing and using publicly shared Twitter datasets in their research.
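Of the four factors listed, completeness is the most direct to state: the fraction of original tweet IDs still retrievable at recollection time. A sketch; the paper's remaining measures (representativity, similarity, changingness) are not reproduced here.

```python
def completeness(original_ids, recollected_ids):
    """Fraction of the original tweet IDs present in the recollected set."""
    original = set(original_ids)
    return len(original & set(recollected_ids)) / len(original)
```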
  19. Tu, Y.-N.; Hsu, S.-L.: Constructing conceptual trajectory maps to trace the development of research fields (2016) 0.05
    
    Abstract
    This study proposes a new method to construct and trace the trajectory of conceptual development of a research field by combining main path analysis, citation analysis, and text-mining techniques. Main path analysis, a method used commonly to trace the most critical path in a citation network, helps describe the developmental trajectory of a research field. This study extends the main path analysis method and applies text-mining techniques in the new method, which reflects the trajectory of conceptual development in an academic research field more accurately than citation frequency, which represents only the articles examined. Articles can be merged based on similarity of concepts, and by merging concepts the history of a research field can be described more precisely. The new method was applied to the "h-index" and "text mining" fields. The precision, recall, and F-measures of the h-index were 0.738, 0.652, and 0.658 and those of text-mining were 0.501, 0.653, and 0.551, respectively. Last, this study not only establishes the conceptual trajectory map of a research field, but also recommends keywords that are more precise than those used currently by researchers. These precise keywords could enable researchers to gather related works more quickly than before.
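The precision, recall, and F-measure figures quoted above relate through the weighted harmonic mean; a sketch of the standard F_beta computation (the paper's exact weighting is not specified here, so plain F_beta is shown):

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F_beta)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```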
  20. Huang, M.-H.; Lin, C.-S.; Chen, D.-Z.: Counting methods, country rank changes, and counting inflation in the assessment of national research productivity and impact (2011) 0.04
    
    Abstract
    The counting of papers and citations is fundamental to the assessment of research productivity and impact. In an age of increasing scientific collaboration across national borders, the counting of papers produced by collaboration between multiple countries, and citations of such papers, raises concerns in country-level research evaluation. In this study, we compared the number counts and country ranks resulting from five different counting methods. We also observed inflation depending on the method used. Using the 1989 to 2008 physics papers indexed in ISI's Web of Science as our sample, we analyzed the counting results in terms of paper count (research productivity) as well as citation count and citation-paper ratio (CP ratio) based evaluation (research impact). The results show that at the country-level assessment, the selection of counting method had only minor influence on the number counts and country rankings in each assessment. However, the influences of counting methods varied between paper count, citation count, and CP ratio based evaluation. The findings also suggest that the popular counting method (whole counting) that gives each collaborating country one full credit may not be the best counting method. Straight counting that accredits only the first or the corresponding author or fractional counting that accredits each collaborator with partial and weighted credit might be the better choices.
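The counting schemes compared above can be made concrete with a small sketch. The function name and input shape are assumptions; straight counting is shown as crediting the first-listed country, one of the two variants the abstract mentions.

```python
from collections import defaultdict

def count_credits(papers, method="whole"):
    """Credit countries for a set of papers, each given as a country list.

    whole      -- every collaborating country gets one full credit;
    straight   -- only the first-listed country gets one credit;
    fractional -- each of n collaborating countries gets 1/n credit."""
    credit = defaultdict(float)
    for countries in papers:
        uniq = list(dict.fromkeys(countries))  # dedupe, keep order
        if method == "whole":
            for c in uniq:
                credit[c] += 1.0
        elif method == "straight":
            credit[uniq[0]] += 1.0
        elif method == "fractional":
            for c in uniq:
                credit[c] += 1.0 / len(uniq)
        else:
            raise ValueError(method)
    return dict(credit)
```

The inflation the study observes follows directly: whole counting sums to more than one credit per collaborative paper, while straight and fractional counting always sum to exactly one.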

Languages

  • e 272
  • d 3