Search (4 results, page 1 of 1)

  • × author_ss:"Boyack, K.W."
  • × language_ss:"e"
  • × theme_ss:"Informetrie"
  1. Boyack, K.W.; Small, H.; Klavans, R.: Improving the accuracy of co-citation clustering using full text (2013) 0.07
    0.0688152 = product of:
      0.2064456 = sum of:
        0.2064456 = weight(_text_:citation in 1036) [ClassicSimilarity], result of:
          0.2064456 = score(doc=1036,freq=16.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.8792412 = fieldWeight in 1036, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.046875 = fieldNorm(doc=1036)
      0.33333334 = coord(1/3)
    
    Abstract
    Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 full text documents from 2007, we compare the results of traditional co-citation clustering using only the bibliographic information to results from co-citation clustering where proximity between reference pairs is factored into the pairwise relationships. We find that accounting for reference proximity from full text can increase the textual coherence (a measure of accuracy) of a co-citation cluster solution by up to 30% over the traditional approach based on bibliographic information.
    Theme
    Citation indexing
  2. Boyack, K.W.; Klavans, R.: Co-citation analysis, bibliographic coupling, and direct citation : which citation approach represents the research front most accurately? (2010) 0.06
    0.060824625 = product of:
      0.18247387 = sum of:
        0.18247387 = weight(_text_:citation in 4111) [ClassicSimilarity], result of:
          0.18247387 = score(doc=4111,freq=18.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.7771468 = fieldWeight in 4111, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4111)
      0.33333334 = coord(1/3)
    
    Abstract
    In the past several years, studies have started to appear comparing the accuracies of various science mapping approaches. These studies primarily compare the cluster solutions resulting from different similarity approaches, and give varying results. In this study we compare the accuracies of cluster solutions of a large corpus of 2,153,769 recent articles from the biomedical literature (2004-2008) using four similarity approaches: co-citation analysis, bibliographic coupling, direct citation, and a bibliographic coupling-based citation-text hybrid approach. Each of the four approaches can be considered a way to represent the research front in biomedicine, and each is able to successfully cluster over 92% of the corpus. Accuracies are compared using two metrics: within-cluster textual coherence as defined by the Jensen-Shannon divergence, and a concentration measure based on the grant-to-article linkages indexed in MEDLINE. Of the three pure citation-based approaches, bibliographic coupling slightly outperforms co-citation analysis using both accuracy measures; direct citation is the least accurate mapping approach by far. The hybrid approach improves upon the bibliographic coupling results in all respects. We consider the results of this study to be robust given the very large size of the corpus, and the specificity of the accuracy measures used.
  3. Klavans, R.; Boyack, K.W.: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? (2017) 0.06
    0.060824625 = product of:
      0.18247387 = sum of:
        0.18247387 = weight(_text_:citation in 3535) [ClassicSimilarity], result of:
          0.18247387 = score(doc=3535,freq=18.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.7771468 = fieldWeight in 3535, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3535)
      0.33333334 = coord(1/3)
    
    Abstract
    In 1965, Price foresaw the day when a citation-based taxonomy of science and technology would be delineated and correspondingly used for science policy. A taxonomy needs to be comprehensive and accurate if it is to be useful for policy making, especially now that policy makers are utilizing citation-based indicators to evaluate people, institutions and laboratories. Determining the accuracy of a taxonomy, however, remains a challenge. Previous work on the accuracy of partition solutions is sparse, and the results of those studies, although useful, have not been definitive. In this study we compare the accuracies of topic-level taxonomies based on the clustering of documents using direct citation, bibliographic coupling, and co-citation. Using a set of new gold standards (articles with at least 100 references), we find that direct citation is better at concentrating references than either bibliographic coupling or co-citation. Using the assumption that higher concentrations of references denote more accurate clusters, direct citation thus provides a more accurate representation of the taxonomy of scientific and technical knowledge than either bibliographic coupling or co-citation. We also find that discipline-level taxonomies based on journal schema are highly inaccurate compared to topic-level taxonomies, and recommend against their use.
  4. Klavans, R.; Boyack, K.W.: Identifying a better measure of relatedness for mapping science (2006) 0.01
    0.012449317 = product of:
      0.03734795 = sum of:
        0.03734795 = product of:
          0.0746959 = sum of:
            0.0746959 = weight(_text_:index in 5252) [ClassicSimilarity], result of:
              0.0746959 = score(doc=5252,freq=4.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.3413878 = fieldWeight in 5252, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5252)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Measuring the relatedness between bibliometric units (journals, documents, authors, or words) is a central task in bibliometric analysis. Relatedness measures are used for many different tasks, among them the generating of maps, or visual pictures, showing the relationship between all items from these data. Despite the importance of these tasks, there has been little written on how to quantitatively evaluate the accuracy of relatedness measures or the resulting maps. The authors propose a new framework for assessing the performance of relatedness measures and visualization algorithms that contains four factors: accuracy, coverage, scalability, and robustness. This method was applied to 10 measures of journal-journal relatedness to determine the best measure. The 10 relatedness measures were then used as inputs to a visualization algorithm to create an additional 10 measures of journal-journal relatedness based on the distances between pairs of journals in two-dimensional space. This second step determines robustness (i.e., which measure remains best after dimension reduction). Results show that, for low coverage (under 50%), the Pearson correlation is the most accurate raw relatedness measure. However, the best overall measure, both at high coverage, and after dimension reduction, is the cosine index or a modified cosine index. Results also showed that the visualization algorithm increased local accuracy for most measures. Possible reasons for this counterintuitive finding are discussed.
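
The score breakdowns listed with each result can be checked by hand. The sketch below recomputes them in Python, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed in the listing. The function names are hypothetical and are not part of any library. Lucene computes in 32-bit floats, so the recomputed values can differ from the listed ones in the last digits.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the breakdowns above.
    tf = math.sqrt(freq)                    # e.g. 4.0 for freq=16.0
    idf = classic_idf(doc_freq, max_docs)   # e.g. about 4.6893 for docFreq=1104
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


QUERY_NORM = 0.050071523  # copied from the listing
MAX_DOCS = 44218          # copied from the listing

# Result 1: term "citation", freq=16, fieldNorm=0.046875, coord(1/3).
result_1 = classic_term_score(16.0, 1104, MAX_DOCS, QUERY_NORM, 0.046875) / 3

# Results 2 and 3: term "citation", freq=18, fieldNorm=0.0390625, coord(1/3).
result_2 = classic_term_score(18.0, 1104, MAX_DOCS, QUERY_NORM, 0.0390625) / 3

# Result 4: term "index", freq=4, fieldNorm=0.0390625,
# with an inner coord(1/2) and the outer coord(1/3).
result_4 = classic_term_score(4.0, 1520, MAX_DOCS, QUERY_NORM, 0.0390625) * 0.5 / 3

print(round(result_1, 6), round(result_2, 6), round(result_4, 6))
```

Results 2 and 3 receive identical scores because they share the same term frequency and field norm for the matched term; result 4 ranks far lower because it matches a different term with a lower frequency and carries the extra coord(1/2) factor.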