Document (#35111)

Author
Gelernter, J.
Title
Image indexing in article component databases
Source
Journal of the American Society for Information Science and Technology. 60(2009) no.10, p.1965-1976
Year
2009
Abstract
It is often necessary to compare data-rich charts, tables, diagrams, or drawings rather than the articles that contextualize that data. The objective of this research has been to create a database of non-textual components (here, maps) that are searchable independently of the articles from which they are taken, with the option to view the source articles. The method mines words from the articles that are near or associated with each component map, and these mined words become the basis of region, time, and subject indexing. The evaluation showed that automatic indexing of the component maps by these three facets works well, and indicates that a large-scale component database following this model is viable.

Similar documents (content)

  1. Catarci, T.; Spaccapietra, S.: Visual information querying (2002) 0.19
    0.18545407 = sum of:
      0.18545407 = product of:
        0.42148653 = sum of:
          0.037819225 = weight(abstract_txt:textual in 4268) [ClassicSimilarity], result of:
            0.037819225 = score(doc=4268,freq=3.0), product of:
              0.11703744 = queryWeight, product of:
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.019604132 = queryNorm
              0.32313785 = fieldWeight in 4268, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.024699053 = weight(abstract_txt:facets in 4268) [ClassicSimilarity], result of:
            0.024699053 = score(doc=4268,freq=1.0), product of:
              0.12706043 = queryWeight, product of:
                1.0419401 = boost
                6.2204237 = idf(docFreq=238, maxDocs=44218)
                0.019604132 = queryNorm
              0.19438824 = fieldWeight in 4268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2204237 = idf(docFreq=238, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.009405312 = weight(abstract_txt:these in 4268) [ClassicSimilarity], result of:
            0.009405312 = score(doc=4268,freq=2.0), product of:
              0.06675323 = queryWeight, product of:
                1.068043 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019604132 = queryNorm
              0.14089674 = fieldWeight in 4268, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.020165704 = weight(abstract_txt:data in 4268) [ClassicSimilarity], result of:
            0.020165704 = score(doc=4268,freq=7.0), product of:
              0.07310432 = queryWeight, product of:
                1.1176971 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019604132 = queryNorm
              0.27584833 = fieldWeight in 4268, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.034872722 = weight(abstract_txt:near in 4268) [ClassicSimilarity], result of:
            0.034872722 = score(doc=4268,freq=1.0), product of:
              0.15991187 = queryWeight, product of:
                1.1689016 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.019604132 = queryNorm
              0.21807463 = fieldWeight in 4268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.090433806 = weight(abstract_txt:charts in 4268) [ClassicSimilarity], result of:
            0.090433806 = score(doc=4268,freq=3.0), product of:
              0.20928442 = queryWeight, product of:
                1.3372298 = boost
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.019604132 = queryNorm
              0.4321096 = fieldWeight in 4268, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.063439645 = weight(abstract_txt:drawings in 4268) [ClassicSimilarity], result of:
            0.063439645 = score(doc=4268,freq=1.0), product of:
              0.23830362 = queryWeight, product of:
                1.4269308 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.019604132 = queryNorm
              0.26621354 = fieldWeight in 4268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.03217982 = weight(abstract_txt:database in 4268) [ClassicSimilarity], result of:
            0.03217982 = score(doc=4268,freq=4.0), product of:
              0.12030099 = queryWeight, product of:
                1.4337955 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.019604132 = queryNorm
              0.26749423 = fieldWeight in 4268, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.041910093 = weight(abstract_txt:maps in 4268) [ClassicSimilarity], result of:
            0.041910093 = score(doc=4268,freq=1.0), product of:
              0.22774343 = queryWeight, product of:
                1.9727658 = boost
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.019604132 = queryNorm
              0.18402328 = fieldWeight in 4268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.02167091 = weight(abstract_txt:that in 4268) [ClassicSimilarity], result of:
            0.02167091 = score(doc=4268,freq=7.0), product of:
              0.11061804 = queryWeight, product of:
                2.3813663 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019604132 = queryNorm
              0.19590755 = fieldWeight in 4268, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
          0.04489022 = weight(abstract_txt:articles in 4268) [ClassicSimilarity], result of:
            0.04489022 = score(doc=4268,freq=1.0), product of:
              0.3003848 = queryWeight, product of:
                3.2041037 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.019604132 = queryNorm
              0.14944239 = fieldWeight in 4268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.03125 = fieldNorm(doc=4268)
        0.44 = coord(11/25)
    
  2. Kostoff, R.N.; Rio, J.A. del; Humenik, J.A.; Garcia, E.O.; Ramirez, A.M.: Citation mining : integrating text mining and bibliometrics for research user profiling (2001) 0.11
    0.114086464 = sum of:
      0.114086464 = product of:
        0.47536027 = sum of:
          0.014107969 = weight(abstract_txt:these in 6850) [ClassicSimilarity], result of:
            0.014107969 = score(doc=6850,freq=2.0), product of:
              0.06675323 = queryWeight, product of:
                1.068043 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019604132 = queryNorm
              0.2113451 = fieldWeight in 6850, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
          0.06253715 = weight(abstract_txt:independently in 6850) [ClassicSimilarity], result of:
            0.06253715 = score(doc=6850,freq=1.0), product of:
              0.18013081 = queryWeight, product of:
                1.2405995 = boost
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.019604132 = queryNorm
              0.3471763 = fieldWeight in 6850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
          0.024134867 = weight(abstract_txt:database in 6850) [ClassicSimilarity], result of:
            0.024134867 = score(doc=6850,freq=1.0), product of:
              0.12030099 = queryWeight, product of:
                1.4337955 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.019604132 = queryNorm
              0.20062068 = fieldWeight in 6850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
          0.017375384 = weight(abstract_txt:that in 6850) [ClassicSimilarity], result of:
            0.017375384 = score(doc=6850,freq=2.0), product of:
              0.11061804 = queryWeight, product of:
                2.3813663 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019604132 = queryNorm
              0.1570755 = fieldWeight in 6850, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
          0.22332604 = weight(abstract_txt:articles in 6850) [ClassicSimilarity], result of:
            0.22332604 = score(doc=6850,freq=11.0), product of:
              0.3003848 = queryWeight, product of:
                3.2041037 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.019604132 = queryNorm
              0.74346656 = fieldWeight in 6850, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
          0.13387884 = weight(abstract_txt:component in 6850) [ClassicSimilarity], result of:
            0.13387884 = score(doc=6850,freq=1.0), product of:
              0.47496024 = queryWeight, product of:
                4.0289903 = boost
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.019604132 = queryNorm
              0.2818738 = fieldWeight in 6850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.046875 = fieldNorm(doc=6850)
        0.24 = coord(6/25)
    
  3. Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: FACET: thesaurus retrieval with semantic term expansion (2002) 0.11
    0.10865222 = sum of:
      0.10865222 = product of:
        0.38804364 = sum of:
          0.043223344 = weight(abstract_txt:facets in 175) [ClassicSimilarity], result of:
            0.043223344 = score(doc=175,freq=1.0), product of:
              0.12706043 = queryWeight, product of:
                1.0419401 = boost
                6.2204237 = idf(docFreq=238, maxDocs=44218)
                0.019604132 = queryNorm
              0.3401794 = fieldWeight in 175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2204237 = idf(docFreq=238, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.01333836 = weight(abstract_txt:data in 175) [ClassicSimilarity], result of:
            0.01333836 = score(doc=175,freq=1.0), product of:
              0.07310432 = queryWeight, product of:
                1.1176971 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019604132 = queryNorm
              0.18245652 = fieldWeight in 175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.053519413 = weight(abstract_txt:tables in 175) [ClassicSimilarity], result of:
            0.053519413 = score(doc=175,freq=1.0), product of:
              0.1465117 = queryWeight, product of:
                1.1188549 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.019604132 = queryNorm
              0.36529103 = fieldWeight in 175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.048769947 = weight(abstract_txt:database in 175) [ClassicSimilarity], result of:
            0.048769947 = score(doc=175,freq=3.0), product of:
              0.12030099 = queryWeight, product of:
                1.4337955 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.019604132 = queryNorm
              0.40539938 = fieldWeight in 175, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.044332676 = weight(abstract_txt:indexing in 175) [ClassicSimilarity], result of:
            0.044332676 = score(doc=175,freq=1.0), product of:
              0.18637505 = queryWeight, product of:
                2.1857078 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.019604132 = queryNorm
              0.23786807 = fieldWeight in 175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.028667921 = weight(abstract_txt:that in 175) [ClassicSimilarity], result of:
            0.028667921 = score(doc=175,freq=4.0), product of:
              0.11061804 = queryWeight, product of:
                2.3813663 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019604132 = queryNorm
              0.25916135 = fieldWeight in 175, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
          0.15619199 = weight(abstract_txt:component in 175) [ClassicSimilarity], result of:
            0.15619199 = score(doc=175,freq=1.0), product of:
              0.47496024 = queryWeight, product of:
                4.0289903 = boost
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.019604132 = queryNorm
              0.32885277 = fieldWeight in 175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.0546875 = fieldNorm(doc=175)
        0.28 = coord(7/25)
    
  4. Keyser, P. de: Indexing : from thesauri to the Semantic Web (2012) 0.11
    0.10529494 = sum of:
      0.10529494 = product of:
        0.5264747 = sum of:
          0.0166264 = weight(abstract_txt:these in 3197) [ClassicSimilarity], result of:
            0.0166264 = score(doc=3197,freq=1.0), product of:
              0.06675323 = queryWeight, product of:
                1.068043 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019604132 = queryNorm
              0.24907261 = fieldWeight in 3197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.078125 = fieldNorm(doc=3197)
          0.10422858 = weight(abstract_txt:independently in 3197) [ClassicSimilarity], result of:
            0.10422858 = score(doc=3197,freq=1.0), product of:
              0.18013081 = queryWeight, product of:
                1.2405995 = boost
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.019604132 = queryNorm
              0.57862717 = fieldWeight in 3197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.078125 = fieldNorm(doc=3197)
          0.14817455 = weight(abstract_txt:maps in 3197) [ClassicSimilarity], result of:
            0.14817455 = score(doc=3197,freq=2.0), product of:
              0.22774343 = queryWeight, product of:
                1.9727658 = boost
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.019604132 = queryNorm
              0.6506205 = fieldWeight in 3197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.078125 = fieldNorm(doc=3197)
          0.2369681 = weight(abstract_txt:indexing in 3197) [ClassicSimilarity], result of:
            0.2369681 = score(doc=3197,freq=14.0), product of:
              0.18637505 = queryWeight, product of:
                2.1857078 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.019604132 = queryNorm
              1.2714583 = fieldWeight in 3197, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=3197)
          0.020477086 = weight(abstract_txt:that in 3197) [ClassicSimilarity], result of:
            0.020477086 = score(doc=3197,freq=1.0), product of:
              0.11061804 = queryWeight, product of:
                2.3813663 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019604132 = queryNorm
              0.18511525 = fieldWeight in 3197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=3197)
        0.2 = coord(5/25)
    
  5. Jacsó, P.: CD-ROM databases with full-page images (1998) 0.10
    0.10416086 = sum of:
      0.10416086 = product of:
        0.5208043 = sum of:
          0.0764563 = weight(abstract_txt:tables in 1890) [ClassicSimilarity], result of:
            0.0764563 = score(doc=1890,freq=1.0), product of:
              0.1465117 = queryWeight, product of:
                1.1188549 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.019604132 = queryNorm
              0.5218443 = fieldWeight in 1890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.078125 = fieldNorm(doc=1890)
          0.13463007 = weight(abstract_txt:searchable in 1890) [ClassicSimilarity], result of:
            0.13463007 = score(doc=1890,freq=2.0), product of:
              0.16956966 = queryWeight, product of:
                1.2036817 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.019604132 = queryNorm
              0.7939514 = fieldWeight in 1890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.078125 = fieldNorm(doc=1890)
          0.13052997 = weight(abstract_txt:charts in 1890) [ClassicSimilarity], result of:
            0.13052997 = score(doc=1890,freq=1.0), product of:
              0.20928442 = queryWeight, product of:
                1.3372298 = boost
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.019604132 = queryNorm
              0.6236965 = fieldWeight in 1890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.078125 = fieldNorm(doc=1890)
          0.020477086 = weight(abstract_txt:that in 1890) [ClassicSimilarity], result of:
            0.020477086 = score(doc=1890,freq=1.0), product of:
              0.11061804 = queryWeight, product of:
                2.3813663 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019604132 = queryNorm
              0.18511525 = fieldWeight in 1890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=1890)
          0.1587109 = weight(abstract_txt:articles in 1890) [ClassicSimilarity], result of:
            0.1587109 = score(doc=1890,freq=2.0), product of:
              0.3003848 = queryWeight, product of:
                3.2041037 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.019604132 = queryNorm
              0.52835864 = fieldWeight in 1890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.078125 = fieldNorm(doc=1890)
        0.2 = coord(5/25)
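
The score trees above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity (TF-IDF) definitions: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final coord factor of matched terms over query terms. The function names (`idf`, `term_score`, `document_score`) are hypothetical and chosen for illustration; they are not part of Lucene's API. Lucene computes these values in single precision, so the results should agree with the listing only to about six significant digits.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float,
               boost: float = 1.0) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total: int) -> float:
    """Sum of the per-term scores, scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


if __name__ == "__main__":
    # Term "textual" in doc 4268 (values copied from the first tree);
    # the listing reports 0.037819225 for this term.
    print(term_score(freq=3.0, doc_freq=306, max_docs=44218,
                     query_norm=0.019604132, field_norm=0.03125))

    # Result 4 (doc 3197): five matching terms out of 25 query terms;
    # the listing reports 0.10529494 for the document.
    per_term = [0.0166264, 0.10422858, 0.14817455, 0.2369681, 0.020477086]
    print(document_score(per_term, matched=5, total=25))
```

The coord factor explains why a document with a larger raw sum can still rank lower: result 4 has a raw sum of 0.5264747 but matches only 5 of 25 query terms, while result 1 has a raw sum of 0.42148653 and matches 11 of 25.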