Search (10 results, page 1 of 1)

  • × author_ss:"Boyack, K.W."
  1. Boyack, K.W.; Klavans, R.: Co-citation analysis, bibliographic coupling, and direct citation : which citation approach represents the research front most accurately? (2010) 0.02
    0.016871057 = product of:
      0.08435529 = sum of:
        0.08435529 = weight(_text_:bibliographic in 4111) [ClassicSimilarity], result of:
          0.08435529 = score(doc=4111,freq=10.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.480894 = fieldWeight in 4111, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4111)
      0.2 = coord(1/5)
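The indented tree above is Lucene's ClassicSimilarity "explain" output, and its arithmetic can be replayed directly from the listed constants. A minimal sketch in Python (variable names are ours, not Lucene's API; only the numeric constants come from the explain tree):

```python
import math

# Replay the ClassicSimilarity (TF-IDF) explain tree for result 1.
freq       = 10.0        # termFreq of "bibliographic" in doc 4111
idf        = 3.893044    # inverse document frequency, precomputed by Lucene
query_norm = 0.04505818  # queryNorm
field_norm = 0.0390625   # fieldNorm: encodes field length, quantized
coord      = 1 / 5       # coord(1/5): 1 of 5 query clauses matched

tf           = math.sqrt(freq)         # 3.1622777 = tf(freq=10.0)
field_weight = tf * idf * field_norm   # 0.480894  = fieldWeight
query_weight = idf * query_norm        # 0.17541347 = queryWeight
score        = query_weight * field_weight * coord

print(score)   # ≈ 0.016871057, the displayed document score
```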
    
    Abstract
    In the past several years studies have started to appear comparing the accuracies of various science mapping approaches. These studies primarily compare the cluster solutions resulting from different similarity approaches, and give varying results. In this study we compare the accuracies of cluster solutions of a large corpus of 2,153,769 recent articles from the biomedical literature (2004-2008) using four similarity approaches: co-citation analysis, bibliographic coupling, direct citation, and a bibliographic coupling-based citation-text hybrid approach. Each of the four approaches can be considered a way to represent the research front in biomedicine, and each is able to successfully cluster over 92% of the corpus. Accuracies are compared using two metrics-within-cluster textual coherence as defined by the Jensen-Shannon divergence, and a concentration measure based on the grant-to-article linkages indexed in MEDLINE. Of the three pure citation-based approaches, bibliographic coupling slightly outperforms co-citation analysis using both accuracy measures; direct citation is the least accurate mapping approach by far. The hybrid approach improves upon the bibliographic coupling results in all respects. We consider the results of this study to be robust given the very large size of the corpus, and the specificity of the accuracy measures used.
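One of the two accuracy metrics named in the abstract, within-cluster textual coherence, is built on the Jensen-Shannon divergence between word distributions: documents in a coherent cluster diverge little from one another. A minimal sketch of the divergence itself (the study's exact coherence formulation is not reproduced here):

```python
import math

def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two discrete distributions.
    With base-2 logs the value lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical term distributions diverge by 0; disjoint ones by 1.
print(jensen_shannon([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]))  # 0.0
print(jensen_shannon([1.0, 0.0], [0.0, 1.0]))            # 1.0
```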
  2. Boyack, K.W.; Small, H.; Klavans, R.: Improving the accuracy of co-citation clustering using full text (2013) 0.02
    0.015681919 = product of:
      0.07840959 = sum of:
        0.07840959 = weight(_text_:bibliographic in 1036) [ClassicSimilarity], result of:
          0.07840959 = score(doc=1036,freq=6.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.44699866 = fieldWeight in 1036, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=1036)
      0.2 = coord(1/5)
    
    Abstract
    Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 full text documents from 2007, we compare the results of traditional co-citation clustering using only the bibliographic information to results from co-citation clustering where proximity between reference pairs is factored into the pairwise relationships. We find that accounting for reference proximity from full text can increase the textual coherence (a measure of accuracy) of a co-citation cluster solution by up to 30% over the traditional approach based on bibliographic information.
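The proximity idea can be sketched simply: instead of counting each co-cited reference pair once per citing paper, weight the pair by how close the two citation markers sit in the full text. The 1/(1+distance) decay below is purely illustrative; the article's actual weighting function is not given in this abstract:

```python
from itertools import combinations
from collections import defaultdict

def proximity_weighted_cocitations(citing_docs):
    """citing_docs: one dict per citing paper, mapping cited-reference id ->
    in-text position of the citation marker (e.g. character offset).
    Each co-cited pair is weighted by 1 / (1 + distance), an illustrative
    decay, not the paper's formula."""
    weights = defaultdict(float)
    for positions in citing_docs:
        for r1, r2 in combinations(sorted(positions), 2):
            dist = abs(positions[r1] - positions[r2])
            weights[(r1, r2)] += 1.0 / (1.0 + dist)
    return dict(weights)

# Refs A and B are co-cited by both papers; in the first they appear in the
# same sentence, in the second far apart, so the first dominates the weight.
docs = [{"A": 100, "B": 105}, {"A": 40, "B": 9000}]
w = proximity_weighted_cocitations(docs)
print(w)
```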
  3. Börner, K.; Chen, C.; Boyack, K.W.: Visualizing knowledge domains (2002) 0.02
    0.015490483 = product of:
      0.038726207 = sum of:
        0.026407382 = weight(_text_:bibliographic in 4286) [ClassicSimilarity], result of:
          0.026407382 = score(doc=4286,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.15054363 = fieldWeight in 4286, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4286)
        0.012318826 = product of:
          0.024637653 = sum of:
            0.024637653 = weight(_text_:data in 4286) [ClassicSimilarity], result of:
              0.024637653 = score(doc=4286,freq=4.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.17292464 = fieldWeight in 4286, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4286)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This chapter reviews visualization techniques that can be used to map the ever-growing domain structure of scientific disciplines and to support information retrieval and classification. In contrast to the comprehensive surveys conducted in traditional fashion by Howard White and Katherine McCain (1997, 1998), this survey not only reviews emerging techniques in interactive data analysis and information visualization, but also depicts the bibliographical structure of the field itself. The chapter starts by reviewing the history of knowledge domain visualization. We then present a general process flow for the visualization of knowledge domains and explain commonly used techniques. In order to visualize the domain reviewed by this chapter, we introduce a bibliographic data set of considerable size, which includes articles from the citation analysis, bibliometrics, semantics, and visualization literatures. Using tutorial style, we then apply various algorithms to demonstrate the visualization effects produced by different approaches and compare the results. The domain visualizations reveal the relationships within and between the four fields that together constitute the focus of this chapter. We conclude with a general discussion of research possibilities. Painting a "big picture" of scientific knowledge has long been desirable for a variety of reasons. Traditional approaches are brute force: scholars must sort through mountains of literature to perceive the outlines of their field. Obviously, this is time-consuming, difficult to replicate, and entails subjective judgments. The task is enormously complex. Sifting through recently published documents to find those that will later be recognized as important is labor intensive. Traditional approaches struggle to keep up with the pace of information growth. In multidisciplinary fields of study it is especially difficult to maintain an overview of literature dynamics. 
Painting the big picture of an ever-evolving scientific discipline is akin to the situation described in the widely known Indian legend about the blind men and the elephant. As the story goes, six blind men were trying to find out what an elephant looked like. They touched different parts of the elephant and quickly jumped to their conclusions. The one touching the body said it must be like a wall; the one touching the tail said it was like a snake; the one touching the legs said it was like a tree trunk, and so forth. But science does not stand still; the steady stream of new scientific literature creates a continuously changing structure. The resulting disappearance, fusion, and emergence of research areas add another twist to the tale-it is as if the elephant is running and dynamically changing its shape. Domain visualization, an emerging field of study, is in a similar situation. Relevant literature is spread across disciplines that have traditionally had few connections. Researchers examining the domain from a particular discipline cannot possibly have an adequate understanding of the whole. As noted by White and McCain (1997), the new generation of information scientists is technically driven in its efforts to visualize scientific disciplines. However, limited progress has been made in terms of connecting pioneers' theories and practices with the potentialities of today's enabling technologies. If the difference between past and present generations lies in the power of available technologies, what they have in common is the ultimate goal-to reveal the development of scientific knowledge.
  4. Klavans, R.; Boyack, K.W.: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? (2017) 0.01
    0.013068265 = product of:
      0.06534132 = sum of:
        0.06534132 = weight(_text_:bibliographic in 3535) [ClassicSimilarity], result of:
          0.06534132 = score(doc=3535,freq=6.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.3724989 = fieldWeight in 3535, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3535)
      0.2 = coord(1/5)
    
    Abstract
    In 1965, Price foresaw the day when a citation-based taxonomy of science and technology would be delineated and correspondingly used for science policy. A taxonomy needs to be comprehensive and accurate if it is to be useful for policy making, especially now that policy makers are utilizing citation-based indicators to evaluate people, institutions and laboratories. Determining the accuracy of a taxonomy, however, remains a challenge. Previous work on the accuracy of partition solutions is sparse, and the results of those studies, although useful, have not been definitive. In this study we compare the accuracies of topic-level taxonomies based on the clustering of documents using direct citation, bibliographic coupling, and co-citation. Using a set of new gold standards-articles with at least 100 references-we find that direct citation is better at concentrating references than either bibliographic coupling or co-citation. Using the assumption that higher concentrations of references denote more accurate clusters, direct citation thus provides a more accurate representation of the taxonomy of scientific and technical knowledge than either bibliographic coupling or co-citation. We also find that discipline-level taxonomies based on journal schema are highly inaccurate compared to topic-level taxonomies, and recommend against their use.
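The working assumption above, that higher concentration of a gold-standard article's references signals a more accurate clustering, can be illustrated with a Herfindahl-style index over the clusters those references fall into (an illustrative stand-in; the study's exact concentration metric is not given here):

```python
from collections import Counter

def reference_concentration(ref_clusters):
    """Herfindahl-style concentration of one article's references across
    clusters: the sum of squared cluster shares. 1.0 means all references
    land in a single cluster; 1/k means an even spread over k clusters."""
    counts = Counter(ref_clusters)
    n = len(ref_clusters)
    return sum((c / n) ** 2 for c in counts.values())

# A taxonomy that keeps 8 of 10 references together scores higher than one
# that scatters every reference into its own cluster.
print(reference_concentration([1] * 8 + [2] * 2))   # ≈ 0.68
print(reference_concentration(list(range(10))))     # ≈ 0.1
```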
  5. Boyack, K.W.; Klavans, R.: Creation of a highly detailed, dynamic, global model and map of science (2014) 0.01
    0.0090539595 = product of:
      0.045269795 = sum of:
        0.045269795 = weight(_text_:bibliographic in 1230) [ClassicSimilarity], result of:
          0.045269795 = score(doc=1230,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.2580748 = fieldWeight in 1230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=1230)
      0.2 = coord(1/5)
    
    Abstract
    The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science covering 16 years and nearly 20 million articles using cocitation-based techniques. The map is then used to define discipline-like structures consisting of natural groupings of articles and clusters of articles. This combination of detail and high-level structure can be used to address planning-related problems such as identification of emerging topics and the identification of which areas of science and technology are innovative and which are simply persisting. In addition to presenting the model and map, several process improvements that result in more accurate structures are detailed, including a bibliographic coupling approach for assigning current papers to cocitation clusters and a sequential hybrid approach to producing visual maps from models.
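The bibliographic coupling assignment mentioned at the end of the abstract can be sketched simply: give each current paper to the cluster whose members share the most references with it. A minimal illustration of the idea, not the article's exact procedure:

```python
def assign_by_coupling(new_refs, clusters):
    """Assign a current paper to the cluster with which it shares the most
    references (bibliographic coupling). `clusters` maps cluster id -> set
    of references cited by that cluster's members."""
    overlap = {cid: len(new_refs & refs) for cid, refs in clusters.items()}
    best = max(overlap, key=overlap.get)
    return best if overlap[best] > 0 else None  # no shared refs: unassigned

# Hypothetical clusters and a new paper citing r2, r3, and r9.
clusters = {"genomics": {"r1", "r2", "r3"}, "optics": {"r7", "r8"}}
print(assign_by_coupling({"r2", "r3", "r9"}, clusters))  # genomics
```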
  6. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.01
    0.008633437 = product of:
      0.04316718 = sum of:
        0.04316718 = product of:
          0.08633436 = sum of:
            0.08633436 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
              0.08633436 = score(doc=1352,freq=4.0), product of:
                0.15778607 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04505818 = queryNorm
                0.54716086 = fieldWeight in 1352, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1352)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:17:40
  7. Klavans, R.; Boyack, K.W.: Toward a consensus map of science (2009) 0.00
    0.0036628568 = product of:
      0.018314283 = sum of:
        0.018314283 = product of:
          0.036628567 = sum of:
            0.036628567 = weight(_text_:22 in 2736) [ClassicSimilarity], result of:
              0.036628567 = score(doc=2736,freq=2.0), product of:
                0.15778607 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04505818 = queryNorm
                0.23214069 = fieldWeight in 2736, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2736)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    22. 3.2009 12:49:33
  8. Klavans, R.; Boyack, K.W.: Identifying a better measure of relatedness for mapping science (2006) 0.00
    0.0024887787 = product of:
      0.012443894 = sum of:
        0.012443894 = product of:
          0.024887787 = sum of:
            0.024887787 = weight(_text_:data in 5252) [ClassicSimilarity], result of:
              0.024887787 = score(doc=5252,freq=2.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.17468026 = fieldWeight in 5252, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5252)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Measuring the relatedness between bibliometric units (journals, documents, authors, or words) is a central task in bibliometric analysis. Relatedness measures are used for many different tasks, among them the generation of maps, or visual pictures, showing the relationship between all items from these data. Despite the importance of these tasks, there has been little written on how to quantitatively evaluate the accuracy of relatedness measures or the resulting maps. The authors propose a new framework for assessing the performance of relatedness measures and visualization algorithms that contains four factors: accuracy, coverage, scalability, and robustness. This method was applied to 10 measures of journal-journal relatedness to determine the best measure. The 10 relatedness measures were then used as inputs to a visualization algorithm to create an additional 10 measures of journal-journal relatedness based on the distances between pairs of journals in two-dimensional space. This second step determines robustness (i.e., which measure remains best after dimension reduction). Results show that, for low coverage (under 50%), the Pearson correlation is the most accurate raw relatedness measure. However, the best overall measure, both at high coverage, and after dimension reduction, is the cosine index or a modified cosine index. Results also showed that the visualization algorithm increased local accuracy for most measures. Possible reasons for this counterintuitive finding are discussed.
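The two best-performing measures named in the abstract differ only in mean-centering: the Pearson correlation is the cosine of mean-centered vectors. A minimal sketch on toy journal-to-journal citation counts (the vectors are invented):

```python
import math

def cosine(x, y):
    """Cosine index between two citation-count vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centered vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])

# Two journals citing the same four sources at different rates.
j1 = [120, 30, 0, 5]
j2 = [100, 40, 2, 0]
print(cosine(j1, j2))
print(pearson(j1, j2))
```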
  9. Klavans, R.; Boyack, K.W.: Using global mapping to create more accurate document-level maps of research fields (2011) 0.00
    0.0024887787 = product of:
      0.012443894 = sum of:
        0.012443894 = product of:
          0.024887787 = sum of:
            0.024887787 = weight(_text_:data in 4956) [ClassicSimilarity], result of:
              0.024887787 = score(doc=4956,freq=2.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.17468026 = fieldWeight in 4956, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4956)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    We describe two general approaches to creating document-level maps of science. To create a local map, one defines and directly maps a sample of data, such as all literature published in a set of information science journals. To create a global map of a research field, one maps "all of science" and then locates a literature sample within that full context. We provide a deductive argument that global mapping should create more accurate partitions of a research field than does local mapping, followed by practical reasons why this may not be so. The field of information science is then mapped at the document level using both local and global methods to provide a case illustration of the differences between the methods. Textual coherence is used to assess the accuracies of both maps. We find that document clusters in the global map have significantly higher coherence than do those in the local map, and that the global map provides unique insights into the field of information science that cannot be discerned from the local map. Specifically, we show that information science and computer science have a large interface and that computer science is the more progressive discipline at that interface. We also show that research communities in temporally linked threads have a much higher coherence than do isolated communities, and that this feature can be used to predict which threads will persist into a subsequent year. Methods that could increase the accuracy of both local and global maps in the future also are discussed.
  10. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Domain visualization using VxInsight® for science and technology management (2002) 0.00
    0.001991023 = product of:
      0.009955115 = sum of:
        0.009955115 = product of:
          0.01991023 = sum of:
            0.01991023 = weight(_text_:data in 5244) [ClassicSimilarity], result of:
              0.01991023 = score(doc=5244,freq=2.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.1397442 = fieldWeight in 5244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5244)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Boyack, Wylie, and Davidson developed VxInsight, which transforms information from documents into a landscape representation which conveys information on the implicit structure of the data as context for queries and exploration. From a list of pre-computed similarities it creates on a plane an x,y location for each item, or can compute its own similarities based on direct and co-citation linkages. Three-dimensional overlays are then generated on the plane to show the extent of clustering at particular points. Metadata associated with clustered objects provides a label for each peak from common words. Clicking on an object will provide citation information, and answer sets for queries run will be displayed as markers on the landscape. A time slider allows a view of terrain changes over time. In a test on the microsystems engineering literature a review article was used to provide seed terms to search Science Citation Index and retrieve 20,923 articles of which 13,433 were connected by citation to at least one other article in the set. The citation list was used to calculate similarity measures and x,y coordinates for each article. Four main categories made up the landscape with 90% of the articles directly related to one or more of the four. A second test used five databases: SCI, Cambridge Scientific Abstracts, Engineering Index, INSPEC, and Medline to extract 17,927 unique articles by Sandia, Los Alamos National Laboratory, and Lawrence Livermore National Laboratory, with text of abstracts and RetrievalWare 6.6 utilized to generate the similarity measures. The subsequent map revealed that despite some overlap the laboratories generally publish in different areas. A third test on 3000 physical science journals utilized 4.7 million articles from SCI where similarity was the un-normalized sum of cites between journals in both directions. Physics occupies a central position, with engineering, mathematics, computing, and materials science strongly linked. 
Chemistry is farther removed but strongly connected.
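The journal-level similarity used in the third test is simple to state: for each pair of journals, sum the citations between them in both directions, without normalization. A sketch of that counting step (journal names are invented):

```python
from collections import defaultdict

def journal_similarity(article_cites):
    """Un-normalized inter-journal similarity: for each journal pair, the sum
    of citations between them in both directions. `article_cites` is a list
    of (citing_journal, cited_journal) pairs, one per citation."""
    sim = defaultdict(int)
    for citing, cited in article_cites:
        if citing != cited:                       # skip journal self-citations
            pair = tuple(sorted((citing, cited)))
            sim[pair] += 1                        # both directions accumulate
    return dict(sim)

cites = [("PhysRev", "JChemPhys"), ("JChemPhys", "PhysRev"),
         ("PhysRev", "JChemPhys"), ("PhysRev", "PhysRev")]
print(journal_similarity(cites))   # {('JChemPhys', 'PhysRev'): 3}
```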