Document (#40923)

Author
An, X.
Huang, J.X.
Title
geNov : a new metric for measuring novelty and relevancy in biomedical information retrieval
Source
Journal of the Association for Information Science and Technology. 68(2017) no.11, pp. 2620-2635
Year
2017
Abstract
For diversity and novelty evaluation in information retrieval, we expect that the novel documents are always ranked higher than the redundant ones and the relevant ones higher than the irrelevant ones. We also expect that the level of novelty and relevancy should be acknowledged. Accordingly, we expect that the evaluation algorithm would reward rankings that respect these expectations. Nevertheless, there are few research articles in the literature that study how to meet such expectations, even fewer in the field of biomedical information retrieval. In this article, we propose a new metric for novelty and relevancy evaluation in biomedical information retrieval based on an aspect-level performance measure introduced by TREC Genomics Track with formal results to show that those expectations above can be respected under ideal conditions. The empirical evaluation indicates that the proposed metric, geNov, is greatly sensitive to the desired characteristics above, and the three parameters are highly tuneable for different evaluation preferences. By experimentally comparing with state-of-the-art metrics for novelty and diversity, the proposed metric shows its advantages in recognizing the ranking quality in terms of novelty, redundancy, relevancy, and irrelevancy and in its discriminative power. Experiments reveal the proposed metric is faster to compute than state-of-the-art metrics.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23958/full.
Footnote
Contribution to a special issue on biomedical information retrieval.
Field
Medicine

Similar documents (author)

  1. Huang, G.W.: Accessing information in an information society (1989) 4.58
    4.5762196 = sum of:
      4.5762196 = weight(author_txt:huang in 2566) [ClassicSimilarity], result of:
        4.5762196 = fieldWeight in 2566, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.321951 = idf(docFreq=75, maxDocs=42306)
          0.625 = fieldNorm(doc=2566)
    
  2. Huang, X.: Applying a generic function-based topical relevance typology to structure clinical questions and answers (2013) 4.58
    4.5762196 = sum of:
      4.5762196 = weight(author_txt:huang in 2531) [ClassicSimilarity], result of:
        4.5762196 = fieldWeight in 2531, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.321951 = idf(docFreq=75, maxDocs=42306)
          0.625 = fieldNorm(doc=2531)
    
  3. Huang, J. Xiangji => Xiangji Huang, J.: 3.88
    3.883051 = sum of:
      3.883051 = weight(author_txt:huang in 235) [ClassicSimilarity], result of:
        3.883051 = fieldWeight in 235, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          7.321951 = idf(docFreq=75, maxDocs=42306)
          0.375 = fieldNorm(doc=235)
    
  4. Huang, M.-H.: Developing an ideal online thesaurus display format (1994) 3.66
    3.6609755 = sum of:
      3.6609755 = weight(author_txt:huang in 4099) [ClassicSimilarity], result of:
        3.6609755 = fieldWeight in 4099, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.321951 = idf(docFreq=75, maxDocs=42306)
          0.5 = fieldNorm(doc=4099)
    
  5. Huang, M.-h.: End-users' searching behaviour : changes in search type over time (1996) 3.66
    3.6609755 = sum of:
      3.6609755 = weight(author_txt:huang in 5197) [ClassicSimilarity], result of:
        3.6609755 = fieldWeight in 5197, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.321951 = idf(docFreq=75, maxDocs=42306)
          0.5 = fieldNorm(doc=5197)
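
The author-match scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the idf value printed in the listing. The function names are illustrative and are not part of Lucene's API; fieldNorm is taken from the listing as given rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency (assumed definition)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)) (assumed definition)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as laid out in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1: freq=1.0, docFreq=75, maxDocs=42306, fieldNorm=0.625 (listed score 4.5762196)
w1 = field_weight(1.0, 75, 42306, 0.625)

# Entry 3: freq=2.0, same idf, fieldNorm=0.375 (listed score 3.883051)
w3 = field_weight(2.0, 75, 42306, 0.375)

print(w1, w3)
```

Small differences in the last digits are expected, because the listing was computed in single precision.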
    

Similar documents (content)

  1. Xiaoyan Li, X.; Croft, W.B.: An information-pattern-based approach to novelty detection (2008) 0.21
    0.20532298 = sum of:
      0.20532298 = product of:
        1.0266149 = sum of:
          0.037689112 = weight(abstract_txt:level in 4081) [ClassicSimilarity], result of:
            0.037689112 = score(doc=4081,freq=3.0), product of:
              0.061366465 = queryWeight, product of:
                1.1249561 = boost
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.012018806 = queryNorm
              0.61416465 = fieldWeight in 4081, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.078125 = fieldNorm(doc=4081)
          0.016478512 = weight(abstract_txt:information in 4081) [ClassicSimilarity], result of:
            0.016478512 = score(doc=4081,freq=6.0), product of:
              0.035350792 = queryWeight, product of:
                1.207493 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.012018806 = queryNorm
              0.46614265 = fieldWeight in 4081, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.078125 = fieldNorm(doc=4081)
          0.06121061 = weight(abstract_txt:proposed in 4081) [ClassicSimilarity], result of:
            0.06121061 = score(doc=4081,freq=3.0), product of:
              0.09705891 = queryWeight, product of:
                1.7327398 = boost
                4.660588 = idf(docFreq=1087, maxDocs=42306)
                0.012018806 = queryNorm
              0.6306542 = fieldWeight in 4081, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.660588 = idf(docFreq=1087, maxDocs=42306)
                0.078125 = fieldNorm(doc=4081)
          0.011328901 = weight(abstract_txt:that in 4081) [ClassicSimilarity], result of:
            0.011328901 = score(doc=4081,freq=1.0), product of:
              0.06029881 = queryWeight, product of:
                2.086212 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.012018806 = queryNorm
              0.18787934 = fieldWeight in 4081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.078125 = fieldNorm(doc=4081)
          0.8999077 = weight(abstract_txt:novelty in 4081) [ClassicSimilarity], result of:
            0.8999077 = score(doc=4081,freq=7.0), product of:
              0.5533084 = queryWeight, product of:
                5.850787 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.012018806 = queryNorm
              1.6264124 = fieldWeight in 4081, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.078125 = fieldNorm(doc=4081)
        0.2 = coord(5/25)
    
  2. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 0.20
    0.20330407 = sum of:
      0.20330407 = product of:
        0.72608596 = sum of:
          0.03015129 = weight(abstract_txt:level in 30) [ClassicSimilarity], result of:
            0.03015129 = score(doc=30,freq=3.0), product of:
              0.061366465 = queryWeight, product of:
                1.1249561 = boost
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.012018806 = queryNorm
              0.49133173 = fieldWeight in 30, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.009321654 = weight(abstract_txt:information in 30) [ClassicSimilarity], result of:
            0.009321654 = score(doc=30,freq=3.0), product of:
              0.035350792 = queryWeight, product of:
                1.207493 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.012018806 = queryNorm
              0.2636901 = fieldWeight in 30, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.03020906 = weight(abstract_txt:higher in 30) [ClassicSimilarity], result of:
            0.03020906 = score(doc=30,freq=1.0), product of:
              0.08861877 = queryWeight, product of:
                1.3518636 = boost
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.012018806 = queryNorm
              0.34088784 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.016798442 = weight(abstract_txt:than in 30) [ClassicSimilarity], result of:
            0.016798442 = score(doc=30,freq=1.0), product of:
              0.068597876 = queryWeight, product of:
                1.456703 = boost
                3.9181254 = idf(docFreq=2285, maxDocs=42306)
                0.012018806 = queryNorm
              0.24488284 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9181254 = idf(docFreq=2285, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.0154590355 = weight(abstract_txt:retrieval in 30) [ClassicSimilarity], result of:
            0.0154590355 = score(doc=30,freq=1.0), product of:
              0.071433045 = queryWeight, product of:
                1.7164636 = boost
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.012018806 = queryNorm
              0.21641295 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.015697785 = weight(abstract_txt:that in 30) [ClassicSimilarity], result of:
            0.015697785 = score(doc=30,freq=3.0), product of:
              0.06029881 = queryWeight, product of:
                2.086212 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.012018806 = queryNorm
              0.26033324 = fieldWeight in 30, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
          0.6084487 = weight(abstract_txt:novelty in 30) [ClassicSimilarity], result of:
            0.6084487 = score(doc=30,freq=5.0), product of:
              0.5533084 = queryWeight, product of:
                5.850787 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.012018806 = queryNorm
              1.0996555 = fieldWeight in 30, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.0625 = fieldNorm(doc=30)
        0.28 = coord(7/25)
    
  3. Xu, Y.; Yin, H.: Novelty and topicality in interactive information retrieval (2008) 0.19
    0.1944647 = sum of:
      0.1944647 = product of:
        0.8102696 = sum of:
          0.0076110987 = weight(abstract_txt:information in 3356) [ClassicSimilarity], result of:
            0.0076110987 = score(doc=3356,freq=2.0), product of:
              0.035350792 = queryWeight, product of:
                1.207493 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.012018806 = queryNorm
              0.21530207 = fieldWeight in 3356, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
          0.03020906 = weight(abstract_txt:higher in 3356) [ClassicSimilarity], result of:
            0.03020906 = score(doc=3356,freq=1.0), product of:
              0.08861877 = queryWeight, product of:
                1.3518636 = boost
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.012018806 = queryNorm
              0.34088784 = fieldWeight in 3356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
          0.016798442 = weight(abstract_txt:than in 3356) [ClassicSimilarity], result of:
            0.016798442 = score(doc=3356,freq=1.0), product of:
              0.068597876 = queryWeight, product of:
                1.456703 = boost
                3.9181254 = idf(docFreq=2285, maxDocs=42306)
                0.012018806 = queryNorm
              0.24488284 = fieldWeight in 3356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9181254 = idf(docFreq=2285, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
          0.0154590355 = weight(abstract_txt:retrieval in 3356) [ClassicSimilarity], result of:
            0.0154590355 = score(doc=3356,freq=1.0), product of:
              0.071433045 = queryWeight, product of:
                1.7164636 = boost
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.012018806 = queryNorm
              0.21641295 = fieldWeight in 3356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
          0.020265754 = weight(abstract_txt:that in 3356) [ClassicSimilarity], result of:
            0.020265754 = score(doc=3356,freq=5.0), product of:
              0.06029881 = queryWeight, product of:
                2.086212 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.012018806 = queryNorm
              0.33608878 = fieldWeight in 3356, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
          0.7199262 = weight(abstract_txt:novelty in 3356) [ClassicSimilarity], result of:
            0.7199262 = score(doc=3356,freq=7.0), product of:
              0.5533084 = queryWeight, product of:
                5.850787 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.012018806 = queryNorm
              1.3011299 = fieldWeight in 3356, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.0625 = fieldNorm(doc=3356)
        0.24 = coord(6/25)
    
  4. MacCall, S.L.; Cleveland, A.D.: A relevance-based quantitative measure for Internet information retrieval evaluation (1999) 0.18
    0.176896 = sum of:
      0.176896 = product of:
        0.63177145 = sum of:
          0.011652068 = weight(abstract_txt:information in 690) [ClassicSimilarity], result of:
            0.011652068 = score(doc=690,freq=3.0), product of:
              0.035350792 = queryWeight, product of:
                1.207493 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.012018806 = queryNorm
              0.32961264 = fieldWeight in 690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.07056893 = weight(abstract_txt:metrics in 690) [ClassicSimilarity], result of:
            0.07056893 = score(doc=690,freq=1.0), product of:
              0.13445282 = queryWeight, product of:
                1.6651562 = boost
                6.71821 = idf(docFreq=138, maxDocs=42306)
                0.012018806 = queryNorm
              0.52486014 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.71821 = idf(docFreq=138, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.03864759 = weight(abstract_txt:retrieval in 690) [ClassicSimilarity], result of:
            0.03864759 = score(doc=690,freq=4.0), product of:
              0.071433045 = queryWeight, product of:
                1.7164636 = boost
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.012018806 = queryNorm
              0.5410324 = fieldWeight in 690, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4626071 = idf(docFreq=3604, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.035339966 = weight(abstract_txt:proposed in 690) [ClassicSimilarity], result of:
            0.035339966 = score(doc=690,freq=1.0), product of:
              0.09705891 = queryWeight, product of:
                1.7327398 = boost
                4.660588 = idf(docFreq=1087, maxDocs=42306)
                0.012018806 = queryNorm
              0.3641084 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.660588 = idf(docFreq=1087, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.011328901 = weight(abstract_txt:that in 690) [ClassicSimilarity], result of:
            0.011328901 = score(doc=690,freq=1.0), product of:
              0.06029881 = queryWeight, product of:
                2.086212 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.012018806 = queryNorm
              0.18787934 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.052785646 = weight(abstract_txt:evaluation in 690) [ClassicSimilarity], result of:
            0.052785646 = score(doc=690,freq=1.0), product of:
              0.15036662 = queryWeight, product of:
                2.7842982 = boost
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.012018806 = queryNorm
              0.3510463 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
          0.41144833 = weight(abstract_txt:metric in 690) [ClassicSimilarity], result of:
            0.41144833 = score(doc=690,freq=3.0), product of:
              0.409868 = queryWeight, product of:
                4.596868 = boost
                7.4185777 = idf(docFreq=68, maxDocs=42306)
                0.012018806 = queryNorm
              1.0038557 = fieldWeight in 690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.4185777 = idf(docFreq=68, maxDocs=42306)
                0.078125 = fieldNorm(doc=690)
        0.28 = coord(7/25)
    
  5. Liu, J.S.; Chen, H.-H.; Ho, M.H.-C.; Li, Y.-C.: Citations with different levels of relevancy : tracing the main paths of legal opinions (2014) 0.15
    0.15059274 = sum of:
      0.15059274 = product of:
        0.7529637 = sum of:
          0.034815714 = weight(abstract_txt:level in 3547) [ClassicSimilarity], result of:
            0.034815714 = score(doc=3547,freq=4.0), product of:
              0.061366465 = queryWeight, product of:
                1.1249561 = boost
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.012018806 = queryNorm
              0.567341 = fieldWeight in 3547, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.538728 = idf(docFreq=1228, maxDocs=42306)
                0.0625 = fieldNorm(doc=3547)
          0.012034205 = weight(abstract_txt:information in 3547) [ClassicSimilarity], result of:
            0.012034205 = score(doc=3547,freq=5.0), product of:
              0.035350792 = queryWeight, product of:
                1.207493 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.012018806 = queryNorm
              0.34042248 = fieldWeight in 3547, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=3547)
          0.03020906 = weight(abstract_txt:higher in 3547) [ClassicSimilarity], result of:
            0.03020906 = score(doc=3547,freq=1.0), product of:
              0.08861877 = queryWeight, product of:
                1.3518636 = boost
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.012018806 = queryNorm
              0.34088784 = fieldWeight in 3547, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4542055 = idf(docFreq=491, maxDocs=42306)
                0.0625 = fieldNorm(doc=3547)
          0.02220002 = weight(abstract_txt:that in 3547) [ClassicSimilarity], result of:
            0.02220002 = score(doc=3547,freq=6.0), product of:
              0.06029881 = queryWeight, product of:
                2.086212 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.012018806 = queryNorm
              0.3681668 = fieldWeight in 3547, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=3547)
          0.6537047 = weight(abstract_txt:relevancy in 3547) [ClassicSimilarity], result of:
            0.6537047 = score(doc=3547,freq=10.0), product of:
              0.4024377 = queryWeight, product of:
                4.074125 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.012018806 = queryNorm
              1.6243626 = fieldWeight in 3547, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.0625 = fieldNorm(doc=3547)
        0.2 = coord(5/25)
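
The content-match scores combine several term scores. Each term contributes queryWeight times fieldWeight, and the sum is scaled by the coord factor (matched terms divided by query terms). As a rough sketch, the score of entry 5 (doc 3547) can be recomputed from the values in its explain tree. The boost, queryNorm and fieldNorm values are copied from the listing as inputs, the tf and idf formulas are the assumed classic TF-IDF definitions, and all function and variable names are illustrative only.

```python
import math

MAX_DOCS = 42306          # maxDocs, from the listing
QUERY_NORM = 0.012018806  # queryNorm, from the listing
FIELD_NORM = 0.0625       # fieldNorm(doc=3547), from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float,
               field_norm: float = FIELD_NORM) -> float:
    """score = queryWeight * fieldWeight for one query term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, boost, docFreq, freq) for the five terms matched in doc 3547
TERMS = [
    ("level", 1.1249561, 1228, 4.0),
    ("information", 1.207493, 10064, 5.0),
    ("higher", 1.3518636, 491, 1.0),
    ("that", 2.086212, 10381, 6.0),
    ("relevancy", 4.074125, 30, 10.0),
]


def document_score(terms, total_clauses: int = 25) -> float:
    """coord(matched/total) * sum of the per-term scores."""
    coord = len(terms) / total_clauses
    return coord * sum(term_score(b, d, f) for _, b, d, f in terms)


score = document_score(TERMS)
print(score)  # the listing reports 0.15059274 for this entry
```

The breakdown shows why the rare term "relevancy" (docFreq=30) dominates the result: its idf enters both queryWeight and fieldWeight, so it is effectively squared.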