Document (#40958)

Author
Soldaini, L.
Yates, A.
Goharian, N.
Title
Learning to reformulate long queries for clinical decision support
Source
Journal of the Association for Information Science and Technology. 68(2017) no.11, pp.2602-2619
Year
2017
Abstract
The large volume of biomedical literature poses a serious problem for medical professionals, who often struggle to keep current with it. At the same time, many health providers consider knowledge of the latest literature in their field a key component of successful clinical practice. In this work, we introduce two systems designed to help retrieve medical literature. Both receive a long, discursive clinical note as the input query and return highly relevant literature that could be used in support of clinical practice. The first system is an improved version of a method previously proposed by the authors; it combines pseudo relevance feedback and a domain-specific term filter to reformulate the query. The second is an approach that uses a deep neural network to reformulate a clinical note. Both approaches were evaluated on the 2014 and 2015 TREC CDS datasets; in our tests, they outperform the previously proposed method by up to 28% in inferred NDCG; furthermore, they are competitive with the state of the art, achieving up to 8% improvement in inferred NDCG.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23924/full.
Footnote
Contribution to a special issue on biomedical information retrieval.
Field
Medicine

Similar documents (author)

  1. Cohan, A.; Young, S.; Yates, A.; Goharian, N.: Triaging content severity in online mental health forums (2017) 4.07
    4.0712194 = sum of:
      4.0712194 = sum of:
        1.7681984 = weight(author_txt:yates in 3930) [ClassicSimilarity], result of:
          1.7681984 = score(doc=3930,freq=1.0), product of:
            0.6425055 = queryWeight, product of:
              8.806516 = idf(docFreq=17, maxDocs=44218)
              0.07295797 = queryNorm
            2.752036 = fieldWeight in 3930, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.806516 = idf(docFreq=17, maxDocs=44218)
              0.3125 = fieldNorm(doc=3930)
        2.3030212 = weight(author_txt:goharian in 3930) [ClassicSimilarity], result of:
          2.3030212 = score(doc=3930,freq=1.0), product of:
            0.7662811 = queryWeight, product of:
              1.092083 = boost
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.07295797 = queryNorm
            3.005452 = fieldWeight in 3930, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.3125 = fieldNorm(doc=3930)
    
  2. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 1.84
    1.8424169 = sum of:
      1.8424169 = product of:
        3.6848338 = sum of:
          3.6848338 = weight(author_txt:goharian in 2765) [ClassicSimilarity], result of:
            3.6848338 = score(doc=2765,freq=1.0), product of:
              0.7662811 = queryWeight, product of:
                1.092083 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.07295797 = queryNorm
              4.808723 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.5 = fieldNorm(doc=2765)
        0.5 = coord(1/2)
    
  3. Mengle, S.S.R.; Goharian, N.: Ambiguity measure feature-selection algorithm (2009) 1.84
    1.8424169 = sum of:
      1.8424169 = product of:
        3.6848338 = sum of:
          3.6848338 = weight(author_txt:goharian in 2804) [ClassicSimilarity], result of:
            3.6848338 = score(doc=2804,freq=1.0), product of:
              0.7662811 = queryWeight, product of:
                1.092083 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.07295797 = queryNorm
              4.808723 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.5 = fieldNorm(doc=2804)
        0.5 = coord(1/2)
    
  4. Mengle, S.S.R.; Goharian, N.: Detecting relationships among categories using text classification (2010) 1.84
    1.8424169 = sum of:
      1.8424169 = product of:
        3.6848338 = sum of:
          3.6848338 = weight(author_txt:goharian in 3462) [ClassicSimilarity], result of:
            3.6848338 = score(doc=3462,freq=1.0), product of:
              0.7662811 = queryWeight, product of:
                1.092083 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.07295797 = queryNorm
              4.808723 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.5 = fieldNorm(doc=3462)
        0.5 = coord(1/2)
    
  5. Baeza-Yates, R.A.: Introduction to data structures and algorithms related to information retrieval (1992) 1.41
    1.4145588 = sum of:
      1.4145588 = product of:
        2.8291175 = sum of:
          2.8291175 = weight(author_txt:yates in 3082) [ClassicSimilarity], result of:
            2.8291175 = score(doc=3082,freq=1.0), product of:
              0.6425055 = queryWeight, product of:
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.07295797 = queryNorm
              4.403258 = fieldWeight in 3082, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.5 = fieldNorm(doc=3082)
        0.5 = coord(1/2)
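
The score traces above follow the usual ClassicSimilarity pattern: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by coord when only some query terms match. As a rough sketch (the helper name term_score is hypothetical and not part of Lucene; all numeric inputs are copied from the traces for hits 1 and 2), the arithmetic can be replayed like this:

```python
# Replays the arithmetic of the explain trees for hits 1 and 2 above.
# term_score is an illustrative helper, not a Lucene API.

def term_score(idf, query_norm, field_norm, freq=1.0, boost=1.0):
    """Contribution of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm          # boost * idf * queryNorm
    field_weight = (freq ** 0.5) * idf * field_norm  # tf * idf * fieldNorm, tf = sqrt(freq)
    return query_weight * field_weight

QUERY_NORM = 0.07295797  # queryNorm as reported in the author traces

# Hit 1 (doc 3930): both author terms match, so the two term scores are summed.
yates = term_score(idf=8.806516, query_norm=QUERY_NORM, field_norm=0.3125)
goharian = term_score(idf=9.617446, query_norm=QUERY_NORM,
                      field_norm=0.3125, boost=1.092083)
hit1 = yates + goharian

# Hit 2 (doc 2765): only one of two query terms matches, hence coord(1/2).
hit2 = 0.5 * term_score(idf=9.617446, query_norm=QUERY_NORM,
                        field_norm=0.5, boost=1.092083)

print(round(hit1, 4), round(hit2, 4))
```

The results should agree with the reported 4.0712194 and 1.8424169 up to floating-point rounding; Lucene computes in 32-bit floats, so the last digits may differ slightly.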
    

Similar documents (content)

  1. Pluye, P.; Grad, R.; Repchinsky, C.; Jovaisas, B.; Johnson-Lafleur, J.; Carrier, M.-E.; Granikov, V.; Farrell, B.; Rodriguez, C.; Bartlett, G.; Loiselle, C.; Légaré, F.: Four levels of outcomes of information-seeking : a mixed methods study in primary health care (2013) 0.13
    0.13116491 = sum of:
      0.13116491 = product of:
        0.65582454 = sum of:
          0.018521192 = weight(abstract_txt:they in 534) [ClassicSimilarity], result of:
            0.018521192 = score(doc=534,freq=2.0), product of:
              0.05584784 = queryWeight, product of:
                1.0520912 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.014147687 = queryNorm
              0.33163667 = fieldWeight in 534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.0625 = fieldNorm(doc=534)
          0.022608086 = weight(abstract_txt:method in 534) [ClassicSimilarity], result of:
            0.022608086 = score(doc=534,freq=1.0), product of:
              0.08036734 = queryWeight, product of:
                1.2620891 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.014147687 = queryNorm
              0.28130937 = fieldWeight in 534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=534)
          0.034338005 = weight(abstract_txt:proposed in 534) [ClassicSimilarity], result of:
            0.034338005 = score(doc=534,freq=2.0), product of:
              0.08428374 = queryWeight, product of:
                1.2924749 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014147687 = queryNorm
              0.4074096 = fieldWeight in 534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=534)
          0.04864692 = weight(abstract_txt:medical in 534) [ClassicSimilarity], result of:
            0.04864692 = score(doc=534,freq=1.0), product of:
              0.13394935 = queryWeight, product of:
                1.6293731 = boost
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.014147687 = queryNorm
              0.36317396 = fieldWeight in 534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.0625 = fieldNorm(doc=534)
          0.5317103 = weight(abstract_txt:clinical in 534) [ClassicSimilarity], result of:
            0.5317103 = score(doc=534,freq=5.0), product of:
              0.5236131 = queryWeight, product of:
                5.0936074 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.014147687 = queryNorm
              1.0154642 = fieldWeight in 534, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=534)
        0.2 = coord(5/25)
    
  2. Grad, R.; Pluye, P.; Granikov, V.; Johnson-Lafleur, J.; Shulha, M.; Sridhar, S.B.; Moscovici, J.L.; Bartlett, G.; Vandal, A.C.; Marlow, B.; Kloda, L.: Physicians' assessment of the value of clinical information : Operationalization of a theoretical model (2011) 0.12
    0.1226448 = sum of:
      0.1226448 = product of:
        0.76653 = sum of:
          0.020810481 = weight(abstract_txt:support in 4763) [ClassicSimilarity], result of:
            0.020810481 = score(doc=4763,freq=1.0), product of:
              0.07604871 = queryWeight, product of:
                1.227711 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.014147687 = queryNorm
              0.27364674 = fieldWeight in 4763, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.0625 = fieldNorm(doc=4763)
          0.045216173 = weight(abstract_txt:method in 4763) [ClassicSimilarity], result of:
            0.045216173 = score(doc=4763,freq=4.0), product of:
              0.08036734 = queryWeight, product of:
                1.2620891 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.014147687 = queryNorm
              0.56261873 = fieldWeight in 4763, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=4763)
          0.027937036 = weight(abstract_txt:practice in 4763) [ClassicSimilarity], result of:
            0.027937036 = score(doc=4763,freq=1.0), product of:
              0.09254593 = queryWeight, product of:
                1.3543437 = boost
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.014147687 = queryNorm
              0.30187213 = fieldWeight in 4763, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.0625 = fieldNorm(doc=4763)
          0.6725663 = weight(abstract_txt:clinical in 4763) [ClassicSimilarity], result of:
            0.6725663 = score(doc=4763,freq=8.0), product of:
              0.5236131 = queryWeight, product of:
                5.0936074 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.014147687 = queryNorm
              1.2844719 = fieldWeight in 4763, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=4763)
        0.16 = coord(4/25)
    
  3. Cruz Díaz, N.P.; Maña López, M.J.; Mata Vázquez, J.; Pachón Álvarez, V.: A machine-learning approach to negation and speculation detection in clinical texts (2012) 0.12
    0.11741171 = sum of:
      0.11741171 = product of:
        0.58705854 = sum of:
          0.04498355 = weight(abstract_txt:biomedical in 283) [ClassicSimilarity], result of:
            0.04498355 = score(doc=283,freq=1.0), product of:
              0.10090893 = queryWeight, product of:
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.014147687 = queryNorm
              0.44578367 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=283)
          0.024280636 = weight(abstract_txt:proposed in 283) [ClassicSimilarity], result of:
            0.024280636 = score(doc=283,freq=1.0), product of:
              0.08428374 = queryWeight, product of:
                1.2924749 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014147687 = queryNorm
              0.2880821 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=283)
          0.04864692 = weight(abstract_txt:medical in 283) [ClassicSimilarity], result of:
            0.04864692 = score(doc=283,freq=1.0), product of:
              0.13394935 = queryWeight, product of:
                1.6293731 = boost
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.014147687 = queryNorm
              0.36317396 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.0625 = fieldNorm(doc=283)
          0.057286337 = weight(abstract_txt:previously in 283) [ClassicSimilarity], result of:
            0.057286337 = score(doc=283,freq=1.0), product of:
              0.14937267 = queryWeight, product of:
                1.7206231 = boost
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.014147687 = queryNorm
              0.38351285 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.0625 = fieldNorm(doc=283)
          0.41186106 = weight(abstract_txt:clinical in 283) [ClassicSimilarity], result of:
            0.41186106 = score(doc=283,freq=3.0), product of:
              0.5236131 = queryWeight, product of:
                5.0936074 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.014147687 = queryNorm
              0.7865752 = fieldWeight in 283, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=283)
        0.2 = coord(5/25)
    
  4. Thelwall, M.; Maflahi, N.: Guideline references and academic citations as evidence of the clinical value of health research (2016) 0.11
    0.11286428 = sum of:
      0.11286428 = product of:
        0.5643214 = sum of:
          0.01309646 = weight(abstract_txt:they in 2856) [ClassicSimilarity], result of:
            0.01309646 = score(doc=2856,freq=1.0), product of:
              0.05584784 = queryWeight, product of:
                1.0520912 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.014147687 = queryNorm
              0.23450254 = fieldWeight in 2856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.0625 = fieldNorm(doc=2856)
          0.027937036 = weight(abstract_txt:practice in 2856) [ClassicSimilarity], result of:
            0.027937036 = score(doc=2856,freq=1.0), product of:
              0.09254593 = queryWeight, product of:
                1.3543437 = boost
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.014147687 = queryNorm
              0.30187213 = fieldWeight in 2856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.0625 = fieldNorm(doc=2856)
          0.06879713 = weight(abstract_txt:medical in 2856) [ClassicSimilarity], result of:
            0.06879713 = score(doc=2856,freq=2.0), product of:
              0.13394935 = queryWeight, product of:
                1.6293731 = boost
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.014147687 = queryNorm
              0.51360554 = fieldWeight in 2856, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.0625 = fieldNorm(doc=2856)
          0.042629737 = weight(abstract_txt:literature in 2856) [ClassicSimilarity], result of:
            0.042629737 = score(doc=2856,freq=1.0), product of:
              0.1545452 = queryWeight, product of:
                2.4751012 = boost
                4.413439 = idf(docFreq=1455, maxDocs=44218)
                0.014147687 = queryNorm
              0.27583992 = fieldWeight in 2856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.413439 = idf(docFreq=1455, maxDocs=44218)
                0.0625 = fieldNorm(doc=2856)
          0.41186106 = weight(abstract_txt:clinical in 2856) [ClassicSimilarity], result of:
            0.41186106 = score(doc=2856,freq=3.0), product of:
              0.5236131 = queryWeight, product of:
                5.0936074 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.014147687 = queryNorm
              0.7865752 = fieldWeight in 2856, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=2856)
        0.2 = coord(5/25)
    
  5. Cimino, J.J.: Distributed cognition and knowledge-based controlled medical terminologies (1998) 0.11
    0.1072494 = sum of:
      0.1072494 = product of:
        0.67030877 = sum of:
          0.032412086 = weight(abstract_txt:they in 3223) [ClassicSimilarity], result of:
            0.032412086 = score(doc=3223,freq=2.0), product of:
              0.05584784 = queryWeight, product of:
                1.0520912 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.014147687 = queryNorm
              0.58036417 = fieldWeight in 3223, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.109375 = fieldNorm(doc=3223)
          0.051503316 = weight(abstract_txt:support in 3223) [ClassicSimilarity], result of:
            0.051503316 = score(doc=3223,freq=2.0), product of:
              0.07604871 = queryWeight, product of:
                1.227711 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.014147687 = queryNorm
              0.67724115 = fieldWeight in 3223, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.109375 = fieldNorm(doc=3223)
          0.17026421 = weight(abstract_txt:medical in 3223) [ClassicSimilarity], result of:
            0.17026421 = score(doc=3223,freq=4.0), product of:
              0.13394935 = queryWeight, product of:
                1.6293731 = boost
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.014147687 = queryNorm
              1.2711089 = fieldWeight in 3223, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8107834 = idf(docFreq=359, maxDocs=44218)
                0.109375 = fieldNorm(doc=3223)
          0.41612917 = weight(abstract_txt:clinical in 3223) [ClassicSimilarity], result of:
            0.41612917 = score(doc=3223,freq=1.0), product of:
              0.5236131 = queryWeight, product of:
                5.0936074 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.014147687 = queryNorm
              0.79472643 = fieldWeight in 3223, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.109375 = fieldNorm(doc=3223)
        0.16 = coord(4/25)
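
The idf, tf, and coord values in the content traces can be recovered from the raw counts they report. The sketch below assumes the standard ClassicSimilarity definitions, idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), and coord = matched / total; the function names are illustrative only, and fieldNorm is left out because it is a lossily encoded length norm that cannot be derived from the trace alone.

```python
import math

MAX_DOCS = 44218  # maxDocs as reported in every trace

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def coord(matched, total):
    """Fraction of query terms that matched the document."""
    return matched / total

# "clinical" in doc 534: docFreq=83, freq=5.0
idf_clinical = idf(83)
tf_clinical = tf(5.0)

# Content hit 1: the summed term scores (0.65582454) scaled by coord(5/25).
hit1_content = coord(5, 25) * 0.65582454

print(idf_clinical, tf_clinical, hit1_content)
```

Under these assumptions the values should land close to the reported 7.2660704, 2.236068, and 0.13116491; small differences in the trailing digits are expected because the traces were produced with 32-bit floats.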