Search (6 results, page 1 of 1)

Sjögårde, P.; Ahlgren, P.; Waltman, L.: Algorithmic labeling in hierarchical classifications of publications : evaluation of bibliographic fields and term weighting approaches (2021) 0.01
```
0.013142297 = product of:
  0.026284594 = sum of:
    0.008582841 = weight(_text_:information in 261) [ClassicSimilarity], result of:
      0.008582841 = score(doc=261,freq=2.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.09697737 = fieldWeight in 261, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=261)
    0.017701752 = product of:
      0.035403505 = sum of:
        0.035403505 = weight(_text_:organization in 261) [ClassicSimilarity], result of:
          0.035403505 = score(doc=261,freq=2.0), product of:
            0.17974974 = queryWeight, product of:
              3.5653565 = idf(docFreq=3399, maxDocs=44218)
              0.050415643 = queryNorm
            0.19695997 = fieldWeight in 261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5653565 = idf(docFreq=3399, maxDocs=44218)
              0.0390625 = fieldNorm(doc=261)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
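The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the values shown. The function names are illustrative and not part of any library; `queryNorm` and `fieldNorm` are copied from the tree rather than derived.

```python
import math

# Values read directly from the explain tree for doc 261.
MAX_DOCS = 44218
QUERY_NORM = 0.050415643   # taken as given, not recomputed here
FIELD_NORM = 0.0390625     # fieldNorm(doc=261)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the classic TF-IDF form."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """score = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


information = term_score(freq=2.0, doc_freq=20772)
organization = term_score(freq=2.0, doc_freq=3399)

# The "organization" clause sits in a nested query with coord(1/2);
# the outer query matched 2 of its 4 clauses, hence coord(2/4).
total = (information + organization * 0.5) * 0.5

print(f"information  = {information:.9f}")
print(f"organization = {organization:.9f}")
print(f"total        = {total:.9f}")
```

Because the engine works in single precision, the recomputed figures should agree with the tree only to about six decimal places.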
Abstract

Algorithmic classifications of research publications can be used to study many different aspects of the science system, such as the organization of science into fields, the growth of fields, interdisciplinarity, and emerging topics. How to label the classes in these classifications is a problem that has not been thoroughly addressed in the literature. In this study, we evaluate different approaches to label the classes in algorithmically constructed classifications of research publications. We focus on two important choices: the choice of (a) different bibliographic fields and (b) different approaches to weight the relevance of terms. To evaluate the different choices, we created two baselines: one based on the Medical Subject Headings in MEDLINE and another based on the Science-Metrix journal classification. We tested to what extent different approaches yield the desired labels for the classes in the two baselines. Based on our results, we recommend extracting terms from titles and keywords to label classes at high levels of granularity (e.g., topics). At low levels of granularity (e.g., disciplines) we recommend extracting terms from journal names and author addresses. We recommend the use of a new approach, term frequency to specificity ratio, to calculate the relevance of terms.

Source

Journal of the Association for Information Science and Technology. 72(2021) no.7, S.853-869

Ahlgren, P.; Jarneving, B.; Rousseau, R.: Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient (2003) 0.01
```
0.011685811 = product of:
  0.023371622 = sum of:
    0.009710376 = weight(_text_:information in 5171) [ClassicSimilarity], result of:
      0.009710376 = score(doc=5171,freq=4.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.10971737 = fieldWeight in 5171, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=5171)
    0.013661247 = product of:
      0.027322493 = sum of:
        0.027322493 = weight(_text_:22 in 5171) [ClassicSimilarity], result of:
          0.027322493 = score(doc=5171,freq=2.0), product of:
            0.17654699 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050415643 = queryNorm
            0.15476047 = fieldWeight in 5171, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=5171)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Ahlgren, Jarneving, and Rousseau review accepted procedures for author co-citation analysis, first pointing out that since in the raw data matrix the row and column values are identical, i.e., the co-citation count of two authors, there is no clear choice for diagonal values. They suggest the number of times an author has been co-cited with himself, excluding self-citation, rather than the common treatment as zeros or as missing values. When the matrix is converted to a similarity matrix, the normal procedure is to create a matrix of Pearson's r coefficients between data vectors. Ranking by r, by co-citation frequency, and by intuition can easily yield three different orders. It would seem necessary that the adding of zeros to the matrix will not affect the value or the relative order of similarity measures, but it is shown that this is not the case with Pearson's r. Using 913 bibliographic descriptions from the Web of Science of articles from JASIS and Scientometrics, authors' names were extracted and edited, and 12 information retrieval authors and 12 bibliometric authors, each from the top 100 most cited, were selected. Co-citation and r value (diagonal elements treated as missing) matrices were constructed, and then reconstructed in expanded form. Adding zeros can change both the r value and the ordering of the authors based upon that value. A chi-squared distance measure would not violate these requirements, nor would the cosine coefficient. It is also argued that co-citation data is ordinal data, since there is no assurance of an absolute zero number of co-citations, and thus Pearson is not appropriate. The number of ties in co-citation data makes the use of the Spearman rank order coefficient problematic.

Date

9. 7.2006 10:22:35

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.6, S.549-568

Ahlgren, P.; Järvelin, K.: Measuring impact of twelve information scientists using the DCI index (2010) 0.00

```
0.0036413912 = product of:
  0.014565565 = sum of:
    0.014565565 = weight(_text_:information in 3593) [ClassicSimilarity], result of:
      0.014565565 = score(doc=3593,freq=4.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.16457605 = fieldWeight in 3593, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3593)
  0.25 = coord(1/4)
```

Source: Journal of the American Society for Information Science and Technology. 61(2010) no.7, S.1424-1439

Ahlgren, P.; Grönqvist, L.: Evaluation of retrieval effectiveness with incomplete relevance data : theoretical and experimental comparison of three measures (2008) 0.00

```
0.0030039945 = product of:
  0.012015978 = sum of:
    0.012015978 = weight(_text_:information in 2032) [ClassicSimilarity], result of:
      0.012015978 = score(doc=2032,freq=2.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.13576832 = fieldWeight in 2032, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2032)
  0.25 = coord(1/4)
```

Source: Information processing and management. 44(2008) no.1, S.212-225

Ahlgren, P.; Kekäläinen, J.: Indexing strategies for Swedish full text retrieval under different user scenarios (2007) 0.00

```
0.0021457102 = product of:
  0.008582841 = sum of:
    0.008582841 = weight(_text_:information in 896) [ClassicSimilarity], result of:
      0.008582841 = score(doc=896,freq=2.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.09697737 = fieldWeight in 896, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=896)
  0.25 = coord(1/4)
```

Source: Information processing and management. 43(2007) no.1, S.81-102

Ahlgren, P.; Colliander, C.; Sjögårde, P.: Exploring the relation between referencing practices and citation impact : a large-scale study based on Web of Science data (2018) 0.00

```
0.0021457102 = product of:
  0.008582841 = sum of:
    0.008582841 = weight(_text_:information in 4250) [ClassicSimilarity], result of:
      0.008582841 = score(doc=4250,freq=2.0), product of:
        0.08850355 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.050415643 = queryNorm
        0.09697737 = fieldWeight in 4250, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4250)
  0.25 = coord(1/4)
```

Source: Journal of the Association for Information Science and Technology. 69(2018) no.5, S.728-743
