  1. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.01
    0.0062868367 = product of:
      0.01886051 = sum of:
        0.01886051 = weight(_text_:on in 3927) [ClassicSimilarity], result of:
          0.01886051 = score(doc=3927,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.1718293 = fieldWeight in 3927, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3927)
      0.33333334 = coord(1/3)
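
The score breakdown above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The values for queryNorm, fieldNorm and coord are copied from the explanation rather than derived, and the function and variable names are illustrative only.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor as defined by ClassicSimilarity."""
    return math.sqrt(freq)


# Values taken directly from the score explanation for doc 3927.
freq, doc_freq, max_docs = 4.0, 13325, 44218
query_norm = 0.04990557   # copied, not recomputed
field_norm = 0.0390625    # copied, not recomputed
coord = 1 / 3             # 1 of 3 query clauses matched

idf = classic_idf(doc_freq, max_docs)
tf = classic_tf(freq)
query_weight = idf * query_norm
field_weight = tf * idf * field_norm
term_score = query_weight * field_weight
final_score = term_score * coord

print(f"idf={idf:.6f} tf={tf:.1f} "
      f"term_score={term_score:.8f} final_score={final_score:.10f}")
```

Lucene computes these factors in single precision, so the recomputed values may differ from the listed ones in the last digits.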
    
    Abstract
    Automatic, large-scale, and accurate extraction of drug-disease pairs from the medical literature plays an important role in drug repurposing. However, most existing extraction methods are supervised, and manually labeling drug-disease pair datasets is costly and time-consuming. Many drug-disease pairs remain buried in free text. In this work, we first use a pattern-based method to automatically extract drug-disease pairs with treatment and inducement relationships from free text. Then, to reflect the drug-disease relation, we propose a network embedding algorithm that calculates the degree of correlation of a drug-disease pair. In the experiments, we use the method to extract treatment and inducement drug-disease pairs from 27 million medical abstracts and titles available on PubMed. We extract 138,318 unique treatment pairs and 75,396 unique inducement pairs. Our algorithm achieves a precision of 0.912 and a recall of 0.898 in extracting the frequent treatment drug-disease pairs, and a precision of 0.923 and a recall of 0.833 in extracting the frequent inducement drug-disease pairs. In addition, our proposed information network embedding algorithm efficiently reflects the degree of correlation of drug-disease pairs. Our algorithm achieves a precision of 0.802 and a recall of 0.783 in the fine-grained evaluation of extracting frequent pairs.
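
The abstract does not list the patterns used, so the following is only a rough sketch of what pattern-based extraction of treatment and inducement pairs could look like. The drug and disease vocabularies, the four regular expressions, and the function name `extract_pairs` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical toy vocabularies; a real system would use large lexicons.
DRUGS = ["metformin", "cisplatin"]
DISEASES = ["type 2 diabetes", "nephrotoxicity"]

drug = "(?P<drug>" + "|".join(map(re.escape, DRUGS)) + ")"
disease = "(?P<disease>" + "|".join(map(re.escape, DISEASES)) + ")"

# Hypothetical surface patterns for the two relation types.
PATTERNS = {
    "treatment": [
        re.compile(rf"{drug} (?:for|in) the treatment of {disease}", re.I),
        re.compile(rf"{disease} (?:was|were) treated with {drug}", re.I),
    ],
    "inducement": [
        re.compile(rf"{drug}-induced {disease}", re.I),
        re.compile(rf"{disease} (?:induced|caused) by {drug}", re.I),
    ],
}


def extract_pairs(sentences):
    """Count (drug, disease, relation) triples matched by any pattern."""
    counts = Counter()
    for sentence in sentences:
        for relation, patterns in PATTERNS.items():
            for pattern in patterns:
                for m in pattern.finditer(sentence):
                    key = (m["drug"].lower(), m["disease"].lower(), relation)
                    counts[key] += 1
    return counts


sentences = [
    "Metformin for the treatment of type 2 diabetes was evaluated.",
    "We report a case of cisplatin-induced nephrotoxicity.",
    "Nephrotoxicity caused by cisplatin is dose-dependent.",
]
pair_counts = extract_pairs(sentences)
```

Counting how often each pair is matched gives a simple way to separate frequent pairs from rare ones. The network embedding step mentioned in the abstract is not covered by this sketch.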
    Footnote
    Contribution to a special issue on biomedical information retrieval.