Document (#32087)

Author
Koike, A.
Takagi, T.
Title
Knowledge discovery based on an implicit and explicit conceptual network
Source
Journal of the American Society for Information Science and Technology. 58(2007) no.1, pp. 51-65
Year
2007
Abstract
The amount of knowledge accumulated in published scientific papers has increased due to the continuing progress being made in scientific research. Since numerous papers have only reported fragments of scientific facts, there are possibilities for discovering new knowledge by connecting these facts. We therefore developed a system called BioTermNet to draft a conceptual network with hybrid methods of information extraction and information retrieval. Two concepts are regarded as related in this system if (a) their relationship is clearly described in MEDLINE abstracts or (b) they have distinctively co-occurred in abstracts. PRIME data, including protein interactions and functions extracted by NLP techniques, are used in the former, and the Singhal measure for information retrieval is used in the latter. Relationships that are not clearly or directly described in an abstract can be extracted by connecting multiple concepts. To evaluate how well this system performs, Swanson's association between Raynaud's disease and fish oil and that between migraine and magnesium were tested with abstracts that had been published before the discovery of these associations. The result was that when start and end concepts were given, plausible and understandable intermediate concepts connecting them could be detected. When only the start concept was given, not only the focused concepts (magnesium and fish oil) but also other probable concepts could be detected as related concept candidates. Finally, this system was applied to find diseases related to the BRCA1 gene. Some other new potentially related diseases were detected along with diseases whose relations to BRCA1 were already known.
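The two evaluation modes in the abstract (start and end concepts given, or only the start concept given) can be pictured on a small undirected concept graph. The following is a minimal sketch under stated assumptions, not BioTermNet's implementation: the function names `build_graph`, `closed_discovery`, and `open_discovery` are hypothetical, the edges are hand-written for illustration rather than extracted from MEDLINE, and `candidate X` is a placeholder concept.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected concept graph from (concept, concept) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def closed_discovery(graph, start, end):
    """Start and end given: return intermediates linked to both."""
    return sorted(graph.get(start, set()) & graph.get(end, set()))


def open_discovery(graph, start):
    """Only start given: return concepts two hops away that have no
    direct link to start, each with the intermediates that reach it."""
    direct = graph.get(start, set())
    candidates = defaultdict(set)
    for mid in direct:
        for target in graph.get(mid, set()):
            if target != start and target not in direct:
                candidates[target].add(mid)
    return {target: sorted(mids) for target, mids in candidates.items()}


# Illustrative edges only (assumption): not output of the system.
toy_pairs = [
    ("Raynaud's disease", "blood viscosity"),
    ("Raynaud's disease", "platelet aggregation"),
    ("blood viscosity", "fish oil"),
    ("platelet aggregation", "fish oil"),
    ("platelet aggregation", "candidate X"),
]
toy_graph = build_graph(toy_pairs)
closed = closed_discovery(toy_graph, "Raynaud's disease", "fish oil")
opened = open_discovery(toy_graph, "Raynaud's disease")
```

In this toy setting `closed` holds the shared intermediates, while `opened` also surfaces `candidate X`, mirroring the abstract's remark that other probable concepts appear alongside the focused one. A real system would additionally rank candidates by the strength of each link.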
Footnote
The BioTermNet is available at http://btn.ontology.ims.u-tokyo.ac.jp.
Theme
Semantic context in indexing and retrieval
Field
Biology
Medicine
Object
BioTermNet

Similar documents (content)

  1. Weeber, M.; Klein, H.; Jong-van den Berg, L.T.W. de; Vos, R.: Using concepts in literature-based discovery : simulating Swanson's Raynaud-Fish Oil and Migraine-Magnesium discoveries (2001) 0.33
    0.32848805 = sum of:
      0.32848805 = product of:
        1.0265251 = sum of:
          0.018135915 = weight(abstract_txt:that in 826) [ClassicSimilarity], result of:
            0.018135915 = score(doc=826,freq=3.0), product of:
              0.046640582 = queryWeight, product of:
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.019476924 = queryNorm
              0.38884407 = fieldWeight in 826, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.16778518 = weight(abstract_txt:swanson's in 826) [ClassicSimilarity], result of:
            0.16778518 = score(doc=826,freq=1.0), product of:
              0.18674995 = queryWeight, product of:
                1.0005027 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.019476924 = queryNorm
              0.89844835 = fieldWeight in 826, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.026448844 = weight(abstract_txt:knowledge in 826) [ClassicSimilarity], result of:
            0.026448844 = score(doc=826,freq=1.0), product of:
              0.07859645 = queryWeight, product of:
                1.1242169 = boost
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.019476924 = queryNorm
              0.33651447 = fieldWeight in 826, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.099013954 = weight(abstract_txt:discovery in 826) [ClassicSimilarity], result of:
            0.099013954 = score(doc=826,freq=2.0), product of:
              0.131388 = queryWeight, product of:
                1.1868091 = boost
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.019476924 = queryNorm
              0.7535997 = fieldWeight in 826, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.04102811 = weight(abstract_txt:system in 826) [ClassicSimilarity], result of:
            0.04102811 = score(doc=826,freq=2.0), product of:
              0.092007324 = queryWeight, product of:
                1.4045242 = boost
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.019476924 = queryNorm
              0.4459222 = fieldWeight in 826, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.3032171 = weight(abstract_txt:fish in 826) [ClassicSimilarity], result of:
            0.3032171 = score(doc=826,freq=1.0), product of:
              0.34908983 = queryWeight, product of:
                1.9345129 = boost
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.019476924 = queryNorm
              0.86859334 = fieldWeight in 826, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.24165127 = weight(abstract_txt:connecting in 826) [ClassicSimilarity], result of:
            0.24165127 = score(doc=826,freq=1.0), product of:
              0.34349826 = queryWeight, product of:
                2.350233 = boost
                7.5040073 = idf(docFreq=63, maxDocs=42740)
                0.019476924 = queryNorm
              0.7035007 = fieldWeight in 826, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5040073 = idf(docFreq=63, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
          0.12924467 = weight(abstract_txt:concepts in 826) [ClassicSimilarity], result of:
            0.12924467 = score(doc=826,freq=2.0), product of:
              0.21298377 = queryWeight, product of:
                2.3891656 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.019476924 = queryNorm
              0.60682875 = fieldWeight in 826, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.09375 = fieldNorm(doc=826)
        0.32 = coord(8/25)
    
  2. Gordon, M.D.; Lindsay, R.K.: Toward discovery support systems : a replication, re-examination, and extension of Swanson's work on literature-based discovery of a connection between Raynaud's and fish oil (1996) 0.24
    0.2373739 = sum of:
      0.2373739 = product of:
        1.1868695 = sum of:
          0.02094155 = weight(abstract_txt:that in 4567) [ClassicSimilarity], result of:
            0.02094155 = score(doc=4567,freq=1.0), product of:
              0.046640582 = queryWeight, product of:
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.019476924 = queryNorm
              0.44899848 = fieldWeight in 4567, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.1875 = fieldNorm(doc=4567)
          0.33557037 = weight(abstract_txt:swanson's in 4567) [ClassicSimilarity], result of:
            0.33557037 = score(doc=4567,freq=1.0), product of:
              0.18674995 = queryWeight, product of:
                1.0005027 = boost
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.019476924 = queryNorm
              1.7968967 = fieldWeight in 4567, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.583449 = idf(docFreq=7, maxDocs=42740)
                0.1875 = fieldNorm(doc=4567)
          0.083896495 = weight(abstract_txt:could in 4567) [ClassicSimilarity], result of:
            0.083896495 = score(doc=4567,freq=1.0), product of:
              0.093377866 = queryWeight, product of:
                1.0005182 = boost
                4.791799 = idf(docFreq=963, maxDocs=42740)
                0.019476924 = queryNorm
              0.8984623 = fieldWeight in 4567, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.791799 = idf(docFreq=963, maxDocs=42740)
                0.1875 = fieldNorm(doc=4567)
          0.14002687 = weight(abstract_txt:discovery in 4567) [ClassicSimilarity], result of:
            0.14002687 = score(doc=4567,freq=1.0), product of:
              0.131388 = queryWeight, product of:
                1.1868091 = boost
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.019476924 = queryNorm
              1.0657508 = fieldWeight in 4567, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.1875 = fieldNorm(doc=4567)
          0.6064342 = weight(abstract_txt:fish in 4567) [ClassicSimilarity], result of:
            0.6064342 = score(doc=4567,freq=1.0), product of:
              0.34908983 = queryWeight, product of:
                1.9345129 = boost
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.019476924 = queryNorm
              1.7371867 = fieldWeight in 4567, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.1875 = fieldNorm(doc=4567)
        0.2 = coord(5/25)
    
  3. Zeng, M.L.; Kronenberg, F.; Molholt, P.: Toward a conceptual framework for complementary and alternative medicine : challenges and issues (2001) 0.18
    0.18457946 = sum of:
      0.18457946 = product of:
        0.57681084 = sum of:
          0.013961034 = weight(abstract_txt:that in 741) [ClassicSimilarity], result of:
            0.013961034 = score(doc=741,freq=4.0), product of:
              0.046640582 = queryWeight, product of:
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.019476924 = queryNorm
              0.29933232 = fieldWeight in 741, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.061202936 = weight(abstract_txt:conceptual in 741) [ClassicSimilarity], result of:
            0.061202936 = score(doc=741,freq=4.0), product of:
              0.09915709 = queryWeight, product of:
                1.0310148 = boost
                4.9378567 = idf(docFreq=832, maxDocs=42740)
                0.019476924 = queryNorm
              0.6172321 = fieldWeight in 741, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.9378567 = idf(docFreq=832, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.03054049 = weight(abstract_txt:knowledge in 741) [ClassicSimilarity], result of:
            0.03054049 = score(doc=741,freq=3.0), product of:
              0.07859645 = queryWeight, product of:
                1.1242169 = boost
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.019476924 = queryNorm
              0.3885734 = fieldWeight in 741, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.046675622 = weight(abstract_txt:discovery in 741) [ClassicSimilarity], result of:
            0.046675622 = score(doc=741,freq=1.0), product of:
              0.131388 = queryWeight, product of:
                1.1868091 = boost
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.019476924 = queryNorm
              0.3552503 = fieldWeight in 741, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.019340836 = weight(abstract_txt:system in 741) [ClassicSimilarity], result of:
            0.019340836 = score(doc=741,freq=1.0), product of:
              0.092007324 = queryWeight, product of:
                1.4045242 = boost
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.019476924 = queryNorm
              0.21020974 = fieldWeight in 741, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.050348073 = weight(abstract_txt:concept in 741) [ClassicSimilarity], result of:
            0.050348073 = score(doc=741,freq=2.0), product of:
              0.12555613 = queryWeight, product of:
                1.4209133 = boost
                4.5368032 = idf(docFreq=1243, maxDocs=42740)
                0.019476924 = queryNorm
              0.40100053 = fieldWeight in 741, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5368032 = idf(docFreq=1243, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.12185305 = weight(abstract_txt:concepts in 741) [ClassicSimilarity], result of:
            0.12185305 = score(doc=741,freq=4.0), product of:
              0.21298377 = queryWeight, product of:
                2.3891656 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.019476924 = queryNorm
              0.57212365 = fieldWeight in 741, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
          0.23288876 = weight(abstract_txt:diseases in 741) [ClassicSimilarity], result of:
            0.23288876 = score(doc=741,freq=1.0), product of:
              0.43916228 = queryWeight, product of:
                2.6574259 = boost
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.019476924 = queryNorm
              0.5303023 = fieldWeight in 741, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.0625 = fieldNorm(doc=741)
        0.32 = coord(8/25)
    
  4. Eijk, C.C. van der; Mulligen, E.M. van; Kors, J.A.; Mons, B.; Berg, J. van den: Constructing an associative concept space for literature-based discovery (2004) 0.16
    0.16121334 = sum of:
      0.16121334 = product of:
        0.5037917 = sum of:
          0.008725647 = weight(abstract_txt:that in 3229) [ClassicSimilarity], result of:
            0.008725647 = score(doc=3229,freq=1.0), product of:
              0.046640582 = queryWeight, product of:
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.019476924 = queryNorm
              0.18708271 = fieldWeight in 3229, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.03680714 = weight(abstract_txt:only in 3229) [ClassicSimilarity], result of:
            0.03680714 = score(doc=3229,freq=1.0), product of:
              0.11063028 = queryWeight, product of:
                1.3337845 = boost
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.019476924 = queryNorm
              0.332704 = fieldWeight in 3229, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.024176046 = weight(abstract_txt:system in 3229) [ClassicSimilarity], result of:
            0.024176046 = score(doc=3229,freq=1.0), product of:
              0.092007324 = queryWeight, product of:
                1.4045242 = boost
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.019476924 = queryNorm
              0.2627622 = fieldWeight in 3229, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3633559 = idf(docFreq=4021, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.06293509 = weight(abstract_txt:concept in 3229) [ClassicSimilarity], result of:
            0.06293509 = score(doc=3229,freq=2.0), product of:
              0.12555613 = queryWeight, product of:
                1.4209133 = boost
                4.5368032 = idf(docFreq=1243, maxDocs=42740)
                0.019476924 = queryNorm
              0.5012507 = fieldWeight in 3229, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5368032 = idf(docFreq=1243, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.069250494 = weight(abstract_txt:scientific in 3229) [ClassicSimilarity], result of:
            0.069250494 = score(doc=3229,freq=2.0), product of:
              0.1338211 = queryWeight, product of:
                1.4669353 = boost
                4.6837454 = idf(docFreq=1073, maxDocs=42740)
                0.019476924 = queryNorm
              0.5174856 = fieldWeight in 3229, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6837454 = idf(docFreq=1073, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.04839189 = weight(abstract_txt:related in 3229) [ClassicSimilarity], result of:
            0.04839189 = score(doc=3229,freq=1.0), product of:
              0.14613266 = queryWeight, product of:
                1.770075 = boost
                4.238725 = idf(docFreq=1675, maxDocs=42740)
                0.019476924 = queryNorm
              0.3311504 = fieldWeight in 3229, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.238725 = idf(docFreq=1675, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.10118904 = weight(abstract_txt:abstracts in 3229) [ClassicSimilarity], result of:
            0.10118904 = score(doc=3229,freq=1.0), product of:
              0.21710756 = queryWeight, product of:
                1.86847 = boost
                5.965797 = idf(docFreq=297, maxDocs=42740)
                0.019476924 = queryNorm
              0.4660779 = fieldWeight in 3229, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.965797 = idf(docFreq=297, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
          0.1523163 = weight(abstract_txt:concepts in 3229) [ClassicSimilarity], result of:
            0.1523163 = score(doc=3229,freq=4.0), product of:
              0.21298377 = queryWeight, product of:
                2.3891656 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.019476924 = queryNorm
              0.7151545 = fieldWeight in 3229, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.078125 = fieldNorm(doc=3229)
        0.32 = coord(8/25)
    
  5. Loh, S.; Oliveira, J.P.M. de; Gastal, F.L.: Knowledge discovery in textual documentation : qualitative and quantitative analyses (2001) 0.14
    0.14436626 = sum of:
      0.14436626 = product of:
        0.7218313 = sum of:
          0.0440814 = weight(abstract_txt:knowledge in 483) [ClassicSimilarity], result of:
            0.0440814 = score(doc=483,freq=4.0), product of:
              0.07859645 = queryWeight, product of:
                1.1242169 = boost
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.019476924 = queryNorm
              0.5608574 = fieldWeight in 483, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5894876 = idf(docFreq=3207, maxDocs=42740)
                0.078125 = fieldNorm(doc=483)
          0.08251163 = weight(abstract_txt:discovery in 483) [ClassicSimilarity], result of:
            0.08251163 = score(doc=483,freq=2.0), product of:
              0.131388 = queryWeight, product of:
                1.1868091 = boost
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.019476924 = queryNorm
              0.6279998 = fieldWeight in 483, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.684005 = idf(docFreq=394, maxDocs=42740)
                0.078125 = fieldNorm(doc=483)
          0.075841375 = weight(abstract_txt:extracted in 483) [ClassicSimilarity], result of:
            0.075841375 = score(doc=483,freq=1.0), product of:
              0.15649232 = queryWeight, product of:
                1.2952379 = boost
                6.2033052 = idf(docFreq=234, maxDocs=42740)
                0.019476924 = queryNorm
              0.4846332 = fieldWeight in 483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2033052 = idf(docFreq=234, maxDocs=42740)
                0.078125 = fieldNorm(doc=483)
          0.1077039 = weight(abstract_txt:concepts in 483) [ClassicSimilarity], result of:
            0.1077039 = score(doc=483,freq=2.0), product of:
              0.21298377 = queryWeight, product of:
                2.3891656 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.019476924 = queryNorm
              0.50569063 = fieldWeight in 483, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.078125 = fieldNorm(doc=483)
          0.41169304 = weight(abstract_txt:diseases in 483) [ClassicSimilarity], result of:
            0.41169304 = score(doc=483,freq=2.0), product of:
              0.43916228 = queryWeight, product of:
                2.6574259 = boost
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.019476924 = queryNorm
              0.9374508 = fieldWeight in 483, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.484837 = idf(docFreq=23, maxDocs=42740)
                0.078125 = fieldNorm(doc=483)
        0.2 = coord(5/25)
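
The score trees above follow the pattern of Lucene's ClassicSimilarity: each matching term contributes queryWeight times fieldWeight, and the sum is scaled by the coord factor (matched terms over query terms). The sketch below recomputes result 2 (doc 4567) from the values printed in its tree. It is a reading aid under assumptions, not the catalog's code: the function names are hypothetical, the idf formula `1 + ln(maxDocs / (docFreq + 1))` is inferred from the listed numbers, and a missing boost line is taken to mean a boost of 1.0.

```python
import math

MAX_DOCS = 42740          # maxDocs as printed in the trees
QUERY_NORM = 0.019476924  # queryNorm as printed in the trees


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    """Sum of term scores scaled by coord = matched / total query terms."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


# Terms matched by doc 4567: (freq, docFreq, fieldNorm, boost)
doc_4567 = [
    (1.0, 10595, 0.1875, 1.0),      # that (no boost line in the tree)
    (1.0, 7, 0.1875, 1.0005027),    # swanson's
    (1.0, 963, 0.1875, 1.0005182),  # could
    (1.0, 394, 0.1875, 1.1868091),  # discovery
    (1.0, 10, 0.1875, 1.9345129),   # fish
]
scores = [term_score(f, df, fn, b) for f, df, fn, b in doc_4567]
total = document_score(scores, 25)
print(f"{total:.4f}")
```

The printed value should land close to the 0.2373739 listed for result 2; small differences in the last digits are expected because this sketch uses double precision while the listed values look like single-precision floats. The same functions also show why rare terms such as "fish" (docFreq=10) dominate the ranking: their idf enters both queryWeight and fieldWeight.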