Document (#40037)

Title
Graphic details : a scientific study of the importance of diagrams to science
Source
Economist. 2016, Jun 18th
Year
2016
Abstract
A PICTURE is said to be worth a thousand words. That metaphor might be expected to pertain a fortiori in the case of scientific papers, where a figure can brilliantly illuminate an idea that might otherwise be baffling. Papers with figures in them should thus be easier to grasp than those without. They should therefore reach larger audiences and, in turn, be more influential simply by virtue of being more widely read. But are they?
Content
Bill Howe and his colleagues at the University of Washington, in Seattle, decided to find out. First, they trained a computer algorithm to distinguish between various sorts of figures, which they defined as diagrams, equations, photographs, plots (such as bar charts and scatter graphs) and tables. They exposed their algorithm to between 400 and 600 images of each of these types of figure until it could distinguish them with an accuracy greater than 90%. Then they set it loose on the more than 650,000 papers (containing more than 10m figures) stored on PubMed Central, an online archive of biomedical-research articles. To measure each paper's influence, they calculated its article-level Eigenfactor score, a modified version of the PageRank algorithm Google uses to provide the most relevant results for internet searches. Eigenfactor scoring gives a better measure than simply noting the number of times a paper is cited elsewhere, because it weights citations by their influence. A citation in a paper that is itself highly cited is worth more than one in a paper that is not.
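The idea that a citation from a highly cited paper counts for more can be illustrated with a small sketch. The code below is plain PageRank by power iteration, not the article-level Eigenfactor variant the team used; the toy citation graph, the paper names and the damping value of 0.85 are all assumptions made for illustration.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Plain PageRank over a citation graph.

    `citations` maps each paper to the list of papers it cites.
    Returns a dict of scores that sum to 1.
    """
    papers = sorted(set(citations) | {p for cited in citations.values() for p in cited})
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their weight evenly over all papers.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        for citing, cited_list in citations.items():
            if not cited_list:
                continue
            # A citing paper splits its own score among the papers it cites.
            share = damping * rank[citing] / len(cited_list)
            for cited in cited_list:
                new_rank[cited] += share
        rank = new_rank
    return rank


# Hypothetical graph: A and B are each cited exactly once.
# A is cited by H, which three other papers cite; B is cited by L, which nobody cites.
toy_citations = {
    "C": ["H"], "D": ["H"], "E": ["H"],
    "H": ["A"],
    "L": ["B"],
    "A": [], "B": [],
}
scores = pagerank(toy_citations)
print(scores["A"] > scores["B"])
```

A raw citation count would rank A and B equally, since each is cited once. The weighted score ranks A higher because its single citation comes from a paper that is itself well cited.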
As the team describe in a paper posted on arXiv (http://arxiv.org/abs/1605.04951), they found that figures did indeed matter, but not all in the same way. An average paper in PubMed Central has about one diagram for every three pages and gets 1.67 citations. Papers with more diagrams per page and, to a lesser extent, plots per page tended to be more influential (on average, a paper accrued two more citations for every extra diagram per page, and one more for every extra plot per page). By contrast, including photographs and equations seemed to decrease the chances of a paper being cited by others. That agrees with a study from 2012, whose authors counted (by hand) the number of mathematical expressions in over 600 biology papers and found that each additional equation per page reduced the number of citations a paper received by 22%. This does not mean that researchers should rush to include more diagrams in their next paper. Dr Howe has not shown what is behind the effect, which may merely be one of correlation, rather than causation. It could, for example, be that papers with lots of diagrams tend to be those that illustrate new concepts, and thus start a whole new field of inquiry. Such papers will certainly be cited a lot. On the other hand, the presence of equations really might reduce citations. Biologists (as are most of those who write and read the papers in PubMed Central) are notoriously maths-averse. If that is the case, looking in a physics archive would probably produce a different result.
Dr Howe and his colleagues do, however, believe that the study of diagrams can result in new insights. A figure showing new metabolic pathways in a cell, for example, may summarise hundreds of experiments. Since illustrations can convey important scientific concepts in this way, they think that browsing through related figures from different papers may help researchers come up with new theories. As Dr Howe puts it, "the unit of scientific currency is closer to the figure than to the paper." With this thought in mind, the team have created a website, viziometrics.org (http://viziometrics.org/), where the millions of images sorted by their program can be searched using key words. Their next plan is to extract the information from particular types of scientific figure, to create comprehensive "super" figures: a giant network of all the known chemical processes in a cell, for example, or the best-available tree of life. At just one such superfigure per paper, though, the citation records of articles containing such all-embracing diagrams may very well undermine the correlation that prompted their creation in the first place. Call it the ultimate marriage of chart and science.
Footnote
Cf.: http://www.economist.com/news/science-and-technology/21700620-surprisingly-simple-test-check-research-papers-errors-come-again.
Theme
Visualization

Similar documents (content)

  1. Dewey, M.: ¬A classification and subject index for cataloguing and arranging the books and pamphlets of a library (1876) 0.15
    0.15112261 = sum of:
      0.15112261 = product of:
        0.3434605 = sum of:
          0.03098431 = weight(abstract_txt:simply in 985) [ClassicSimilarity], result of:
            0.03098431 = score(doc=985,freq=2.0), product of:
              0.16513662 = queryWeight, product of:
                1.0317587 = boost
                6.792872 = idf(docFreq=128, maxDocs=42306)
                0.023561982 = queryNorm
              0.18762834 = fieldWeight in 985, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.792872 = idf(docFreq=128, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.011363894 = weight(abstract_txt:more in 985) [ClassicSimilarity], result of:
            0.011363894 = score(doc=985,freq=4.0), product of:
              0.08461232 = queryWeight, product of:
                1.0444514 = boost
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.023561982 = queryNorm
              0.13430543 = fieldWeight in 985, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.027146673 = weight(abstract_txt:said in 985) [ClassicSimilarity], result of:
            0.027146673 = score(doc=985,freq=1.0), product of:
              0.19050361 = queryWeight, product of:
                1.1081742 = boost
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.023561982 = queryNorm
              0.14249952 = fieldWeight in 985, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.027892904 = weight(abstract_txt:otherwise in 985) [ClassicSimilarity], result of:
            0.027892904 = score(doc=985,freq=1.0), product of:
              0.19397897 = queryWeight, product of:
                1.1182368 = boost
                7.3622246 = idf(docFreq=72, maxDocs=42306)
                0.023561982 = queryNorm
              0.14379345 = fieldWeight in 985, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3622246 = idf(docFreq=72, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.050020114 = weight(abstract_txt:figures in 985) [ClassicSimilarity], result of:
            0.050020114 = score(doc=985,freq=3.0), product of:
              0.19852485 = queryWeight, product of:
                1.1312637 = boost
                7.4479914 = idf(docFreq=66, maxDocs=42306)
                0.023561982 = queryNorm
              0.25195897 = fieldWeight in 985, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.4479914 = idf(docFreq=66, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.017035726 = weight(abstract_txt:they in 985) [ClassicSimilarity], result of:
            0.017035726 = score(doc=985,freq=5.0), product of:
              0.10288513 = queryWeight, product of:
                1.1517222 = boost
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.023561982 = queryNorm
              0.16558006 = fieldWeight in 985, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.07614239 = weight(abstract_txt:figure in 985) [ClassicSimilarity], result of:
            0.07614239 = score(doc=985,freq=5.0), product of:
              0.22157453 = queryWeight, product of:
                1.1951333 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.023561982 = queryNorm
              0.34364235 = fieldWeight in 985, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.04079379 = weight(abstract_txt:thousand in 985) [ClassicSimilarity], result of:
            0.04079379 = score(doc=985,freq=1.0), product of:
              0.2499318 = queryWeight, product of:
                1.2693086 = boost
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.023561982 = queryNorm
              0.16321969 = fieldWeight in 985, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.017580973 = weight(abstract_txt:should in 985) [ClassicSimilarity], result of:
            0.017580973 = score(doc=985,freq=2.0), product of:
              0.1426004 = queryWeight, product of:
                1.3559129 = boost
                4.463516 = idf(docFreq=1324, maxDocs=42306)
                0.023561982 = queryNorm
              0.12328838 = fieldWeight in 985, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.463516 = idf(docFreq=1324, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.014531429 = weight(abstract_txt:scientific in 985) [ClassicSimilarity], result of:
            0.014531429 = score(doc=985,freq=1.0), product of:
              0.15823688 = queryWeight, product of:
                1.4283192 = boost
                4.7018695 = idf(docFreq=1043, maxDocs=42306)
                0.023561982 = queryNorm
              0.09183339 = fieldWeight in 985, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7018695 = idf(docFreq=1043, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
          0.029968254 = weight(abstract_txt:might in 985) [ClassicSimilarity], result of:
            0.029968254 = score(doc=985,freq=2.0), product of:
              0.20348534 = queryWeight, product of:
                1.6197127 = boost
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.023561982 = queryNorm
              0.14727476 = fieldWeight in 985, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.01953125 = fieldNorm(doc=985)
        0.44 = coord(11/25)
    
  2. Rolling, L.: ¬The role of graphic display of concept relationships in indexing and retrieval vocabularies (1985) 0.14
    0.14175123 = sum of:
      0.14175123 = product of:
        0.5062544 = sum of:
          0.047874436 = weight(abstract_txt:easier in 4647) [ClassicSimilarity], result of:
            0.047874436 = score(doc=4647,freq=1.0), product of:
              0.1551269 = queryWeight, product of:
                6.58378 = idf(docFreq=158, maxDocs=42306)
                0.023561982 = queryNorm
              0.30861467 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.58378 = idf(docFreq=158, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.023619412 = weight(abstract_txt:more in 4647) [ClassicSimilarity], result of:
            0.023619412 = score(doc=4647,freq=3.0), product of:
              0.08461232 = queryWeight, product of:
                1.0444514 = boost
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.023561982 = queryNorm
              0.2791486 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.06549831 = weight(abstract_txt:worth in 4647) [ClassicSimilarity], result of:
            0.06549831 = score(doc=4647,freq=1.0), product of:
              0.19117807 = queryWeight, product of:
                1.1101341 = boost
                7.308879 = idf(docFreq=76, maxDocs=42306)
                0.023561982 = queryNorm
              0.34260368 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.308879 = idf(docFreq=76, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.01828466 = weight(abstract_txt:they in 4647) [ClassicSimilarity], result of:
            0.01828466 = score(doc=4647,freq=1.0), product of:
              0.10288513 = queryWeight, product of:
                1.1517222 = boost
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.023561982 = queryNorm
              0.17771918 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.17134786 = weight(abstract_txt:diagrams in 4647) [ClassicSimilarity], result of:
            0.17134786 = score(doc=4647,freq=5.0), product of:
              0.21226601 = queryWeight, product of:
                1.1697598 = boost
                7.7014403 = idf(docFreq=51, maxDocs=42306)
                0.023561982 = queryNorm
              0.8072317 = fieldWeight in 4647, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.7014403 = idf(docFreq=51, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.08172459 = weight(abstract_txt:figure in 4647) [ClassicSimilarity], result of:
            0.08172459 = score(doc=4647,freq=1.0), product of:
              0.22157453 = queryWeight, product of:
                1.1951333 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.023561982 = queryNorm
              0.3688357 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
          0.09790509 = weight(abstract_txt:thousand in 4647) [ClassicSimilarity], result of:
            0.09790509 = score(doc=4647,freq=1.0), product of:
              0.2499318 = queryWeight, product of:
                1.2693086 = boost
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.023561982 = queryNorm
              0.39172724 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.046875 = fieldNorm(doc=4647)
        0.28 = coord(7/25)
    
  3. Hedden, H.: Creating an index for your Web site to make info easier to see (2006) 0.09
    0.094408646 = sum of:
      0.094408646 = product of:
        0.47204322 = sum of:
          0.12269161 = weight(abstract_txt:simply in 9) [ClassicSimilarity], result of:
            0.12269161 = score(doc=9,freq=1.0), product of:
              0.16513662 = queryWeight, product of:
                1.0317587 = boost
                6.792872 = idf(docFreq=128, maxDocs=42306)
                0.023561982 = queryNorm
              0.74297035 = fieldWeight in 9, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.792872 = idf(docFreq=128, maxDocs=42306)
                0.109375 = fieldNorm(doc=9)
          0.031818904 = weight(abstract_txt:more in 9) [ClassicSimilarity], result of:
            0.031818904 = score(doc=9,freq=1.0), product of:
              0.08461232 = queryWeight, product of:
                1.0444514 = boost
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.023561982 = queryNorm
              0.3760552 = fieldWeight in 9, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.109375 = fieldNorm(doc=9)
          0.15620026 = weight(abstract_txt:otherwise in 9) [ClassicSimilarity], result of:
            0.15620026 = score(doc=9,freq=1.0), product of:
              0.19397897 = queryWeight, product of:
                1.1182368 = boost
                7.3622246 = idf(docFreq=72, maxDocs=42306)
                0.023561982 = queryNorm
              0.8052433 = fieldWeight in 9, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3622246 = idf(docFreq=72, maxDocs=42306)
                0.109375 = fieldNorm(doc=9)
          0.042664208 = weight(abstract_txt:they in 9) [ClassicSimilarity], result of:
            0.042664208 = score(doc=9,freq=1.0), product of:
              0.10288513 = queryWeight, product of:
                1.1517222 = boost
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.023561982 = queryNorm
              0.4146781 = fieldWeight in 9, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7913425 = idf(docFreq=2594, maxDocs=42306)
                0.109375 = fieldNorm(doc=9)
          0.11866823 = weight(abstract_txt:might in 9) [ClassicSimilarity], result of:
            0.11866823 = score(doc=9,freq=1.0), product of:
              0.20348534 = queryWeight, product of:
                1.6197127 = boost
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.023561982 = queryNorm
              0.5831783 = fieldWeight in 9, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.109375 = fieldNorm(doc=9)
        0.2 = coord(5/25)
    
  4. Prathap, G.: Measures for impact, consistency, and the h- and g-indices (2014) 0.09
    0.08694366 = sum of:
      0.08694366 = product of:
        0.5433979 = sum of:
          0.031818904 = weight(abstract_txt:more in 3251) [ClassicSimilarity], result of:
            0.031818904 = score(doc=3251,freq=1.0), product of:
              0.08461232 = queryWeight, product of:
                1.0444514 = boost
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.023561982 = queryNorm
              0.3760552 = fieldWeight in 3251, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.109375 = fieldNorm(doc=3251)
          0.27409902 = weight(abstract_txt:virtue in 3251) [ClassicSimilarity], result of:
            0.27409902 = score(doc=3251,freq=1.0), product of:
              0.2822096 = queryWeight, product of:
                1.3487837 = boost
                8.8800955 = idf(docFreq=15, maxDocs=42306)
                0.023561982 = queryNorm
              0.9712604 = fieldWeight in 3251, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.8800955 = idf(docFreq=15, maxDocs=42306)
                0.109375 = fieldNorm(doc=3251)
          0.12058035 = weight(abstract_txt:should in 3251) [ClassicSimilarity], result of:
            0.12058035 = score(doc=3251,freq=3.0), product of:
              0.1426004 = queryWeight, product of:
                1.3559129 = boost
                4.463516 = idf(docFreq=1324, maxDocs=42306)
                0.023561982 = queryNorm
              0.8455821 = fieldWeight in 3251, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.463516 = idf(docFreq=1324, maxDocs=42306)
                0.109375 = fieldNorm(doc=3251)
          0.11689965 = weight(abstract_txt:papers in 3251) [ClassicSimilarity], result of:
            0.11689965 = score(doc=3251,freq=1.0), product of:
              0.2014585 = queryWeight, product of:
                1.6116259 = boost
                5.305295 = idf(docFreq=570, maxDocs=42306)
                0.023561982 = queryNorm
              0.58026665 = fieldWeight in 3251, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.305295 = idf(docFreq=570, maxDocs=42306)
                0.109375 = fieldNorm(doc=3251)
        0.16 = coord(4/25)
    
  5. Kollewe, W.: Data representation by nested line diagrams illustrated by a survey of pensioners (1991) 0.08
    0.0841754 = sum of:
      0.0841754 = product of:
        0.5260963 = sum of:
          0.022727787 = weight(abstract_txt:more in 5230) [ClassicSimilarity], result of:
            0.022727787 = score(doc=5230,freq=1.0), product of:
              0.08461232 = queryWeight, product of:
                1.0444514 = boost
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.023561982 = queryNorm
              0.26861086 = fieldWeight in 5230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.438219 = idf(docFreq=3693, maxDocs=42306)
                0.078125 = fieldNorm(doc=5230)
          0.2554303 = weight(abstract_txt:diagrams in 5230) [ClassicSimilarity], result of:
            0.2554303 = score(doc=5230,freq=4.0), product of:
              0.21226601 = queryWeight, product of:
                1.1697598 = boost
                7.7014403 = idf(docFreq=51, maxDocs=42306)
                0.023561982 = queryNorm
              1.2033501 = fieldWeight in 5230, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.7014403 = idf(docFreq=51, maxDocs=42306)
                0.078125 = fieldNorm(doc=5230)
          0.16317517 = weight(abstract_txt:thousand in 5230) [ClassicSimilarity], result of:
            0.16317517 = score(doc=5230,freq=1.0), product of:
              0.2499318 = queryWeight, product of:
                1.2693086 = boost
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.023561982 = queryNorm
              0.65287876 = fieldWeight in 5230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.078125 = fieldNorm(doc=5230)
          0.08476303 = weight(abstract_txt:might in 5230) [ClassicSimilarity], result of:
            0.08476303 = score(doc=5230,freq=1.0), product of:
              0.20348534 = queryWeight, product of:
                1.6197127 = boost
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.023561982 = queryNorm
              0.41655594 = fieldWeight in 5230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.331916 = idf(docFreq=555, maxDocs=42306)
                0.078125 = fieldNorm(doc=5230)
        0.16 = coord(4/25)
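
The score breakdowns above follow the pattern of Lucene's ClassicSimilarity (TF-IDF) explanation output. As a rough sketch of how one term's contribution and the final document score fit together, the Python below recomputes the "simply" term for doc 985 and the total for the first similar document. The function names are hypothetical; the formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), are the ones that reproduce the values printed in the listing. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency: rarer terms get a larger weight.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # Score of one query term in one document: queryWeight * fieldWeight.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def coord(matched_terms, total_terms):
    # Fraction of the query terms that the document actually contains.
    return matched_terms / total_terms


# Values copied from the breakdown of "abstract_txt:simply in 985".
simply = term_score(
    freq=2.0,
    doc_freq=128,
    max_docs=42306,
    boost=1.0317587,
    query_norm=0.023561982,
    field_norm=0.01953125,
)

# Final score of the first similar document: sum of term scores times coord(11/25).
total = 0.3434605 * coord(11, 25)

print(round(simply, 6), round(total, 6))
```

The coord factor explains why a document that matches only a few of the 25 query terms ends up with a low final score even when its individual term weights are high.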