Search (8 results, page 1 of 1)

  • author_ss:"Boyack, K.W."
  1. Boyack, K.W.; Small, H.; Klavans, R.: Improving the accuracy of co-citation clustering using full text (2013) 0.02
    0.017623993 = product of:
      0.052871976 = sum of:
        0.052871976 = weight(_text_:based in 1036) [ClassicSimilarity], result of:
          0.052871976 = score(doc=1036,freq=6.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.34595144 = fieldWeight in 1036, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=1036)
      0.33333334 = coord(1/3)
    
    Abstract
    Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 full text documents from 2007, we compare the results of traditional co-citation clustering using only the bibliographic information to results from co-citation clustering where proximity between reference pairs is factored into the pairwise relationships. We find that accounting for reference proximity from full text can increase the textual coherence (a measure of accuracy) of a co-citation cluster solution by up to 30% over the traditional approach based on bibliographic information.
  2. Klavans, R.; Boyack, K.W.: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? (2017) 0.02
    0.016958695 = product of:
      0.050876085 = sum of:
        0.050876085 = weight(_text_:based in 3535) [ClassicSimilarity], result of:
          0.050876085 = score(doc=3535,freq=8.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.33289194 = fieldWeight in 3535, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3535)
      0.33333334 = coord(1/3)
    
    Abstract
    In 1965, Price foresaw the day when a citation-based taxonomy of science and technology would be delineated and correspondingly used for science policy. A taxonomy needs to be comprehensive and accurate if it is to be useful for policy making, especially now that policy makers are utilizing citation-based indicators to evaluate people, institutions and laboratories. Determining the accuracy of a taxonomy, however, remains a challenge. Previous work on the accuracy of partition solutions is sparse, and the results of those studies, although useful, have not been definitive. In this study we compare the accuracies of topic-level taxonomies based on the clustering of documents using direct citation, bibliographic coupling, and co-citation. Using a set of new gold standards (articles with at least 100 references), we find that direct citation is better at concentrating references than either bibliographic coupling or co-citation. Using the assumption that higher concentrations of references denote more accurate clusters, direct citation thus provides a more accurate representation of the taxonomy of scientific and technical knowledge than either bibliographic coupling or co-citation. We also find that discipline-level taxonomies based on journal schema are highly inaccurate compared to topic-level taxonomies, and recommend against their use.
  3. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.02
    0.016198358 = product of:
      0.04859507 = sum of:
        0.04859507 = product of:
          0.09719014 = sum of:
            0.09719014 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
              0.09719014 = score(doc=1352,freq=4.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.54716086 = fieldWeight in 1352, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1352)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:17:40
  4. Boyack, K.W.; Klavans, R.: Co-citation analysis, bibliographic coupling, and direct citation : which citation approach represents the research front most accurately? (2010) 0.01
    0.014686662 = product of:
      0.044059984 = sum of:
        0.044059984 = weight(_text_:based in 4111) [ClassicSimilarity], result of:
          0.044059984 = score(doc=4111,freq=6.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.28829288 = fieldWeight in 4111, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4111)
      0.33333334 = coord(1/3)
    
    Abstract
    In the past several years studies have started to appear comparing the accuracies of various science mapping approaches. These studies primarily compare the cluster solutions resulting from different similarity approaches, and give varying results. In this study we compare the accuracies of cluster solutions of a large corpus of 2,153,769 recent articles from the biomedical literature (2004-2008) using four similarity approaches: co-citation analysis, bibliographic coupling, direct citation, and a bibliographic coupling-based citation-text hybrid approach. Each of the four approaches can be considered a way to represent the research front in biomedicine, and each is able to successfully cluster over 92% of the corpus. Accuracies are compared using two metrics: within-cluster textual coherence as defined by the Jensen-Shannon divergence, and a concentration measure based on the grant-to-article linkages indexed in MEDLINE. Of the three pure citation-based approaches, bibliographic coupling slightly outperforms co-citation analysis using both accuracy measures; direct citation is the least accurate mapping approach by far. The hybrid approach improves upon the bibliographic coupling results in all respects. We consider the results of this study to be robust given the very large size of the corpus, and the specificity of the accuracy measures used.
  5. Boyack, K.W.; Klavans, R.: Creation of a highly detailed, dynamic, global model and map of science (2014) 0.01
    0.010175217 = product of:
      0.03052565 = sum of:
        0.03052565 = weight(_text_:based in 1230) [ClassicSimilarity], result of:
          0.03052565 = score(doc=1230,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.19973516 = fieldWeight in 1230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=1230)
      0.33333334 = coord(1/3)
    
    Abstract
    The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science covering 16 years and nearly 20 million articles using cocitation-based techniques. The map is then used to define discipline-like structures consisting of natural groupings of articles and clusters of articles. This combination of detail and high-level structure can be used to address planning-related problems such as identification of emerging topics and the identification of which areas of science and technology are innovative and which are simply persisting. In addition to presenting the model and map, several process improvements that result in greater accuracy structures are detailed, including a bibliographic coupling approach for assigning current papers to cocitation clusters and a sequential hybrid approach to producing visual maps from models.
  6. Klavans, R.; Boyack, K.W.: Identifying a better measure of relatedness for mapping science (2006) 0.01
    0.008479347 = product of:
      0.025438042 = sum of:
        0.025438042 = weight(_text_:based in 5252) [ClassicSimilarity], result of:
          0.025438042 = score(doc=5252,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.16644597 = fieldWeight in 5252, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5252)
      0.33333334 = coord(1/3)
    
    Abstract
    Measuring the relatedness between bibliometric units (journals, documents, authors, or words) is a central task in bibliometric analysis. Relatedness measures are used for many different tasks, among them the generating of maps, or visual pictures, showing the relationship between all items from these data. Despite the importance of these tasks, there has been little written on how to quantitatively evaluate the accuracy of relatedness measures or the resulting maps. The authors propose a new framework for assessing the performance of relatedness measures and visualization algorithms that contains four factors: accuracy, coverage, scalability, and robustness. This method was applied to 10 measures of journal-journal relatedness to determine the best measure. The 10 relatedness measures were then used as inputs to a visualization algorithm to create an additional 10 measures of journal-journal relatedness based on the distances between pairs of journals in two-dimensional space. This second step determines robustness (i.e., which measure remains best after dimension reduction). Results show that, for low coverage (under 50%), the Pearson correlation is the most accurate raw relatedness measure. However, the best overall measure, both at high coverage, and after dimension reduction, is the cosine index or a modified cosine index. Results also showed that the visualization algorithm increased local accuracy for most measures. Possible reasons for this counterintuitive finding are discussed.
  7. Klavans, R.; Boyack, K.W.: Toward a consensus map of science (2009) 0.01
    0.006872381 = product of:
      0.020617142 = sum of:
        0.020617142 = product of:
          0.041234285 = sum of:
            0.041234285 = weight(_text_:22 in 2736) [ClassicSimilarity], result of:
              0.041234285 = score(doc=2736,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.23214069 = fieldWeight in 2736, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2736)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 3.2009 12:49:33
  8. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Domain visualization using VxInsight® for science and technology management (2002) 0.01
    0.006783478 = product of:
      0.020350434 = sum of:
        0.020350434 = weight(_text_:based in 5244) [ClassicSimilarity], result of:
          0.020350434 = score(doc=5244,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.13315678 = fieldWeight in 5244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03125 = fieldNorm(doc=5244)
      0.33333334 = coord(1/3)
    
    Abstract
    Boyack, Wylie, and Davidson developed VxInsight, which transforms information from documents into a landscape representation that conveys information on the implicit structure of the data as context for queries and exploration. From a list of pre-computed similarities it creates on a plane an x,y location for each item, or can compute its own similarities based on direct and co-citation linkages. Three-dimensional overlays are then generated on the plane to show the extent of clustering at particular points. Metadata associated with clustered objects provides a label for each peak from common words. Clicking on an object provides citation information, and answer sets for queries that are run are displayed as markers on the landscape. A time slider allows a view of terrain changes over time. In a test on the microsystems engineering literature, a review article was used to provide seed terms to search Science Citation Index and retrieve 20,923 articles, of which 13,433 were connected by citation to at least one other article in the set. The citation list was used to calculate similarity measures and x,y coordinates for each article. Four main categories made up the landscape, with 90% of the articles directly related to one or more of the four. A second test used five databases (SCI, Cambridge Scientific Abstracts, Engineering Index, INSPEC, and Medline) to extract 17,927 unique articles by Sandia, Los Alamos National Laboratory, and Lawrence Livermore National Laboratory, with text of abstracts and RetrievalWare 6.6 utilized to generate the similarity measures. The subsequent map revealed that despite some overlap the laboratories generally publish in different areas. A third test on 3000 physical science journals utilized 4.7 million articles from SCI, where similarity was the un-normalized sum of cites between journals in both directions. Physics occupies a central position, with engineering, mathematics, computing, and materials science strongly linked. Chemistry is farther removed but strongly connected.
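
Note on the score breakdowns: the [ClassicSimilarity] trees listed with each result can be reproduced by hand. The following is a minimal Python sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical and not part of any library, and queryNorm, fieldNorm and the coord factors are copied from the listing rather than derived. Lucene computes in single precision, so the last digits may differ slightly.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Hypothetical helper mirroring one "weight(...)" node of the explain tree.
    tf = math.sqrt(freq)                   # tf(freq)
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight     # score = queryWeight * fieldWeight

QUERY_NORM = 0.050723847  # copied from the listing
MAX_DOCS = 44218          # copied from the listing

# Result 1 (doc 1036): term "based", freq=6, docFreq=5906, fieldNorm=0.046875
term_1 = classic_term_score(6.0, 5906, MAX_DOCS, QUERY_NORM, 0.046875)
score_1 = term_1 * (1 / 3)  # coord(1/3); listed as 0.017623993

# Result 3 (doc 1352): term "22", freq=4, docFreq=3622, fieldNorm=0.078125
term_3 = classic_term_score(4.0, 3622, MAX_DOCS, QUERY_NORM, 0.078125)
score_3 = term_3 * (1 / 2) * (1 / 3)  # coord(1/2), then coord(1/3); listed as 0.016198358

print(f"result 1: term={term_1:.6f} final={score_1:.6f}")
print(f"result 3: term={term_3:.6f} final={score_3:.6f}")
```

The two cases differ only in nesting: results matched on "22" carry an extra inner coord(1/2) before the outer coord(1/3) is applied.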