Document (#32086)

Author
Koike, A.
Takagi, T.
Title
Knowledge discovery based on an implicit and explicit conceptual network
Source
Journal of the American Society for Information Science and Technology. 58(2007) no.1, pp.51-65
Year
2007
Abstract
The amount of knowledge accumulated in published scientific papers has increased due to the continuing progress being made in scientific research. Since numerous papers have only reported fragments of scientific facts, there are possibilities for discovering new knowledge by connecting these facts. We therefore developed a system called BioTermNet to draft a conceptual network with hybrid methods of information extraction and information retrieval. Two concepts are regarded as related in this system if (a) their relationship is clearly described in MEDLINE abstracts or (b) they have distinctively co-occurred in abstracts. PRIME data, including protein interactions and functions extracted by NLP techniques, are used in the former, and the Singhal measure for information retrieval is used in the latter. Relationships that are not clearly or directly described in an abstract can be extracted by connecting multiple concepts. To evaluate how well this system performs, Swanson's association between Raynaud's disease and fish oil and that between migraine and magnesium were tested with abstracts that had been published before the discovery of these associations. The result was that when start and end concepts were given, plausible and understandable intermediate concepts connecting them could be detected. When only the start concept was given, not only the focused concept (magnesium and fish oil) but also other probable concepts could be detected as related concept candidates. Finally, this system was applied to find diseases related to the BRCA1 gene. Some other new potentially related diseases were detected along with diseases whose relations to BRCA1 were already known.
Footnote
The BioTermNet is available at http://btn.ontology.ims.u-tokyo.ac.jp.
Theme
Semantic context in indexing and retrieval
Field
Biology
Medicine
Object
BioTermNet

Similar documents (content)

  1. Weeber, M.; Klein, H.; Jong-van den Berg, L.T.W. de; Vos, R.: Using concepts in literature-based discovery : simulating Swanson's Raynaud-Fish Oil and Migraine-Magnesium discoveries (2001) 0.33
    0.3283194 = sum of:
      0.3283194 = product of:
        1.0259981 = sum of:
          0.017650198 = weight(abstract_txt:that in 5910) [ClassicSimilarity], result of:
            0.017650198 = score(doc=5910,freq=3.0), product of:
              0.045873888 = queryWeight, product of:
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019360358 = queryNorm
              0.38475478 = fieldWeight in 5910, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.17035379 = weight(abstract_txt:swanson's in 5910) [ClassicSimilarity], result of:
            0.17035379 = score(doc=5910,freq=1.0), product of:
              0.18893863 = queryWeight, product of:
                1.0147233 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.019360358 = queryNorm
              0.9016355 = fieldWeight in 5910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.025763325 = weight(abstract_txt:knowledge in 5910) [ClassicSimilarity], result of:
            0.025763325 = score(doc=5910,freq=1.0), product of:
              0.07735017 = queryWeight, product of:
                1.1245493 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019360358 = queryNorm
              0.33307394 = fieldWeight in 5910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.096106224 = weight(abstract_txt:discovery in 5910) [ClassicSimilarity], result of:
            0.096106224 = score(doc=5910,freq=2.0), product of:
              0.12899926 = queryWeight, product of:
                1.1857574 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019360358 = queryNorm
              0.7450137 = fieldWeight in 5910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.041546017 = weight(abstract_txt:system in 5910) [ClassicSimilarity], result of:
            0.041546017 = score(doc=5910,freq=2.0), product of:
              0.09292141 = queryWeight, product of:
                1.4232302 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.019360358 = queryNorm
              0.44710916 = fieldWeight in 5910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.3079712 = weight(abstract_txt:fish in 5910) [ClassicSimilarity], result of:
            0.3079712 = score(doc=5910,freq=1.0), product of:
              0.35326692 = queryWeight, product of:
                1.9622473 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.019360358 = queryNorm
              0.8717805 = fieldWeight in 5910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.23877887 = weight(abstract_txt:connecting in 5910) [ClassicSimilarity], result of:
            0.23877887 = score(doc=5910,freq=1.0), product of:
              0.34129027 = queryWeight, product of:
                2.3621628 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.019360358 = queryNorm
              0.69963574 = fieldWeight in 5910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
          0.12782846 = weight(abstract_txt:concepts in 5910) [ClassicSimilarity], result of:
            0.12782846 = score(doc=5910,freq=2.0), product of:
              0.21174732 = queryWeight, product of:
                2.4020452 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.019360358 = queryNorm
              0.60368395 = fieldWeight in 5910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.09375 = fieldNorm(doc=5910)
        0.32 = coord(8/25)
    
  2. Gordon, M.D.; Lindsay, R.K.: Toward discovery support systems : a replication, re-examination, and extension of Swanson's work on literature-based discovery of a connection between Raynaud's and fish oil (1996) 0.24
    0.23924004 = sum of:
      0.23924004 = product of:
        1.1962001 = sum of:
          0.020380694 = weight(abstract_txt:that in 4498) [ClassicSimilarity], result of:
            0.020380694 = score(doc=4498,freq=1.0), product of:
              0.045873888 = queryWeight, product of:
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019360358 = queryNorm
              0.44427657 = fieldWeight in 4498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.1875 = fieldNorm(doc=4498)
          0.08325472 = weight(abstract_txt:could in 4498) [ClassicSimilarity], result of:
            0.08325472 = score(doc=4498,freq=1.0), product of:
              0.093042664 = queryWeight, product of:
                1.007032 = boost
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.019360358 = queryNorm
              0.89480156 = fieldWeight in 4498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.1875 = fieldNorm(doc=4498)
          0.34070757 = weight(abstract_txt:swanson's in 4498) [ClassicSimilarity], result of:
            0.34070757 = score(doc=4498,freq=1.0), product of:
              0.18893863 = queryWeight, product of:
                1.0147233 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.019360358 = queryNorm
              1.803271 = fieldWeight in 4498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.1875 = fieldNorm(doc=4498)
          0.13591471 = weight(abstract_txt:discovery in 4498) [ClassicSimilarity], result of:
            0.13591471 = score(doc=4498,freq=1.0), product of:
              0.12899926 = queryWeight, product of:
                1.1857574 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019360358 = queryNorm
              1.0536084 = fieldWeight in 4498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.1875 = fieldNorm(doc=4498)
          0.6159424 = weight(abstract_txt:fish in 4498) [ClassicSimilarity], result of:
            0.6159424 = score(doc=4498,freq=1.0), product of:
              0.35326692 = queryWeight, product of:
                1.9622473 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.019360358 = queryNorm
              1.743561 = fieldWeight in 4498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.1875 = fieldNorm(doc=4498)
        0.2 = coord(5/25)
    
  3. Zeng, M.L.; Kronenberg, F.; Molholt, P.: Toward a conceptual framework for complementary and alternative medicine : challenges and issues (2001) 0.18
    0.18285696 = sum of:
      0.18285696 = product of:
        0.571428 = sum of:
          0.013587129 = weight(abstract_txt:that in 6740) [ClassicSimilarity], result of:
            0.013587129 = score(doc=6740,freq=4.0), product of:
              0.045873888 = queryWeight, product of:
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019360358 = queryNorm
              0.2961844 = fieldWeight in 6740, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.059755534 = weight(abstract_txt:conceptual in 6740) [ClassicSimilarity], result of:
            0.059755534 = score(doc=6740,freq=4.0), product of:
              0.097736284 = queryWeight, product of:
                1.0321199 = boost
                4.891165 = idf(docFreq=902, maxDocs=44218)
                0.019360358 = queryNorm
              0.6113956 = fieldWeight in 6740, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.891165 = idf(docFreq=902, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.029748928 = weight(abstract_txt:knowledge in 6740) [ClassicSimilarity], result of:
            0.029748928 = score(doc=6740,freq=3.0), product of:
              0.07735017 = queryWeight, product of:
                1.1245493 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019360358 = queryNorm
              0.38460067 = fieldWeight in 6740, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.045304906 = weight(abstract_txt:discovery in 6740) [ClassicSimilarity], result of:
            0.045304906 = score(doc=6740,freq=1.0), product of:
              0.12899926 = queryWeight, product of:
                1.1857574 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019360358 = queryNorm
              0.35120282 = fieldWeight in 6740, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.01958498 = weight(abstract_txt:system in 6740) [ClassicSimilarity], result of:
            0.01958498 = score(doc=6740,freq=1.0), product of:
              0.09292141 = queryWeight, product of:
                1.4232302 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.019360358 = queryNorm
              0.21076928 = fieldWeight in 6740, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.04953753 = weight(abstract_txt:concept in 6740) [ClassicSimilarity], result of:
            0.04953753 = score(doc=6740,freq=2.0), product of:
              0.124394275 = queryWeight, product of:
                1.4260937 = boost
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.019360358 = queryNorm
              0.39822996 = fieldWeight in 6740, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.12051782 = weight(abstract_txt:concepts in 6740) [ClassicSimilarity], result of:
            0.12051782 = score(doc=6740,freq=4.0), product of:
              0.21174732 = queryWeight, product of:
                2.4020452 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.019360358 = queryNorm
              0.5691587 = fieldWeight in 6740, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
          0.23339118 = weight(abstract_txt:diseases in 6740) [ClassicSimilarity], result of:
            0.23339118 = score(doc=6740,freq=1.0), product of:
              0.44046402 = queryWeight, product of:
                2.683509 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.019360358 = queryNorm
              0.5298757 = fieldWeight in 6740, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.0625 = fieldNorm(doc=6740)
        0.32 = coord(8/25)
    
  4. Eijk, C.C. van der; Mulligen, E.M. van; Kors, J.A.; Mons, B.; Berg, J. van den: Constructing an associative concept space for literature-based discovery (2004) 0.16
    0.15954961 = sum of:
      0.15954961 = product of:
        0.49859256 = sum of:
          0.008491956 = weight(abstract_txt:that in 2228) [ClassicSimilarity], result of:
            0.008491956 = score(doc=2228,freq=1.0), product of:
              0.045873888 = queryWeight, product of:
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.019360358 = queryNorm
              0.18511525 = fieldWeight in 2228, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.03635565 = weight(abstract_txt:only in 2228) [ClassicSimilarity], result of:
            0.03635565 = score(doc=2228,freq=1.0), product of:
              0.10989098 = queryWeight, product of:
                1.3403829 = boost
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.019360358 = queryNorm
              0.33083376 = fieldWeight in 2228, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.024481224 = weight(abstract_txt:system in 2228) [ClassicSimilarity], result of:
            0.024481224 = score(doc=2228,freq=1.0), product of:
              0.09292141 = queryWeight, product of:
                1.4232302 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.019360358 = queryNorm
              0.2634616 = fieldWeight in 2228, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.06192191 = weight(abstract_txt:concept in 2228) [ClassicSimilarity], result of:
            0.06192191 = score(doc=2228,freq=2.0), product of:
              0.124394275 = queryWeight, product of:
                1.4260937 = boost
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.019360358 = queryNorm
              0.49778745 = fieldWeight in 2228, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.06770542 = weight(abstract_txt:scientific in 2228) [ClassicSimilarity], result of:
            0.06770542 = score(doc=2228,freq=2.0), product of:
              0.13202406 = queryWeight, product of:
                1.469178 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.019360358 = queryNorm
              0.5128264 = fieldWeight in 2228, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.04745166 = weight(abstract_txt:related in 2228) [ClassicSimilarity], result of:
            0.04745166 = score(doc=2228,freq=1.0), product of:
              0.14445347 = queryWeight, product of:
                1.7745214 = boost
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.019360358 = queryNorm
              0.32849097 = fieldWeight in 2228, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.10153744 = weight(abstract_txt:abstracts in 2228) [ClassicSimilarity], result of:
            0.10153744 = score(doc=2228,freq=1.0), product of:
              0.21793732 = queryWeight, product of:
                1.887616 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.019360358 = queryNorm
              0.46590203 = fieldWeight in 2228, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
          0.15064727 = weight(abstract_txt:concepts in 2228) [ClassicSimilarity], result of:
            0.15064727 = score(doc=2228,freq=4.0), product of:
              0.21174732 = queryWeight, product of:
                2.4020452 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.019360358 = queryNorm
              0.7114483 = fieldWeight in 2228, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.078125 = fieldNorm(doc=2228)
        0.32 = coord(8/25)
    
  5. Loh, S.; Oliveira, J.P.M. de; Gastal, F.L.: Knowledge discovery in textual documentation : qualitative and quantitative analyses (2001) 0.14
    0.14334407 = sum of:
      0.14334407 = product of:
        0.71672034 = sum of:
          0.042938873 = weight(abstract_txt:knowledge in 4482) [ClassicSimilarity], result of:
            0.042938873 = score(doc=4482,freq=4.0), product of:
              0.07735017 = queryWeight, product of:
                1.1245493 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019360358 = queryNorm
              0.5551232 = fieldWeight in 4482, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.078125 = fieldNorm(doc=4482)
          0.08008851 = weight(abstract_txt:discovery in 4482) [ClassicSimilarity], result of:
            0.08008851 = score(doc=4482,freq=2.0), product of:
              0.12899926 = queryWeight, product of:
                1.1857574 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019360358 = queryNorm
              0.6208447 = fieldWeight in 4482, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.078125 = fieldNorm(doc=4482)
          0.074587986 = weight(abstract_txt:extracted in 4482) [ClassicSimilarity], result of:
            0.074587986 = score(doc=4482,freq=1.0), product of:
              0.15499927 = queryWeight, product of:
                1.2997717 = boost
                6.159553 = idf(docFreq=253, maxDocs=44218)
                0.019360358 = queryNorm
              0.4812151 = fieldWeight in 4482, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.159553 = idf(docFreq=253, maxDocs=44218)
                0.078125 = fieldNorm(doc=4482)
          0.10652371 = weight(abstract_txt:concepts in 4482) [ClassicSimilarity], result of:
            0.10652371 = score(doc=4482,freq=2.0), product of:
              0.21174732 = queryWeight, product of:
                2.4020452 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.019360358 = queryNorm
              0.50306994 = fieldWeight in 4482, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.078125 = fieldNorm(doc=4482)
          0.41258124 = weight(abstract_txt:diseases in 4482) [ClassicSimilarity], result of:
            0.41258124 = score(doc=4482,freq=2.0), product of:
              0.44046402 = queryWeight, product of:
                2.683509 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.019360358 = queryNorm
              0.93669677 = fieldWeight in 4482, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.078125 = fieldNorm(doc=4482)
        0.2 = coord(5/25)
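
Note on the score trees above: each per-term weight is the product of a queryWeight (boost x idf x queryNorm) and a fieldWeight (tf x idf x fieldNorm), and the per-term weights are summed and multiplied by coord. The following is a minimal Python sketch of that arithmetic, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the values printed in the trees. The function names and the `TERMS_5910` table are illustrative only; the numbers in the table are copied from the first entry (doc 5910).

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the trees above
QUERY_NORM = 0.019360358  # queryNorm, as printed in the trees above


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq, doc_freq, field_norm, boost=1.0):
    """One 'weight(abstract_txt:term in doc)' node: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# (term, freq, docFreq, boost) for the eight matching terms of doc 5910.
# The term "that" has no boost line in the tree, so 1.0 is assumed.
TERMS_5910 = [
    ("that", 3.0, 11241, 1.0),
    ("swanson's", 1.0, 7, 1.0147233),
    ("knowledge", 1.0, 3442, 1.1245493),
    ("discovery", 2.0, 435, 1.1857574),
    ("system", 2.0, 4123, 1.4232302),
    ("fish", 1.0, 10, 1.9622473),
    ("connecting", 1.0, 68, 2.3621628),
    ("concepts", 2.0, 1265, 2.4020452),
]

FIELD_NORM_5910 = 0.09375
COORD_5910 = 8 / 25  # 8 of the 25 query terms matched

weight_sum = sum(
    term_weight(freq, doc_freq, FIELD_NORM_5910, boost)
    for _, freq, doc_freq, boost in TERMS_5910
)
score = weight_sum * COORD_5910

print(f"sum of term weights: {weight_sum:.7f}")
print(f"final score:         {score:.7f}")
```

Under these assumptions the result should land close to the 0.3283194 reported for the first entry; small differences in the last digits are expected because the index stores single-precision values. The same function can be applied to the other entries by substituting their freq, docFreq, boost, fieldNorm and coord values.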