Document (#40922)

Author
An, X.
Huang, J.X.
Title
geNov : a new metric for measuring novelty and relevancy in biomedical information retrieval
Source
Journal of the Association for Information Science and Technology. 68(2017) no.11, p.2620-2635
Year
2017
Abstract
For diversity and novelty evaluation in information retrieval, we expect that the novel documents are always ranked higher than the redundant ones and the relevant ones higher than the irrelevant ones. We also expect that the level of novelty and relevancy should be acknowledged. Accordingly, we expect that the evaluation algorithm would reward rankings that respect these expectations. Nevertheless, there are few research articles in the literature that study how to meet such expectations, even fewer in the field of biomedical information retrieval. In this article, we propose a new metric for novelty and relevancy evaluation in biomedical information retrieval based on an aspect-level performance measure introduced by TREC Genomics Track with formal results to show that those expectations above can be respected under ideal conditions. The empirical evaluation indicates that the proposed metric, geNov, is greatly sensitive to the desired characteristics above, and the three parameters are highly tuneable for different evaluation preferences. By experimentally comparing with state-of-the-art metrics for novelty and diversity, the proposed metric shows its advantages in recognizing the ranking quality in terms of novelty, redundancy, relevancy, and irrelevancy and in its discriminative power. Experiments reveal the proposed metric is faster to compute than state-of-the-art metrics.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23958/full.
Footnote
Contribution to a special issue on biomedical information retrieval.
Field
Medicine

Similar documents (author)

  1. Huang, G.W.: Accessing information in an information society (1989) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:huang in 2497) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 2497, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=2497)
    
  2. Huang, X.: Applying a generic function-based topical relevance typology to structure clinical questions and answers (2013) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:huang in 530) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 530, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=530)
    
  3. Huang, J. Xiangji => Xiangji Huang, J.: 3.82
    3.8168268 = sum of:
      3.8168268 = weight(author_txt:huang in 8235) [ClassicSimilarity], result of:
        3.8168268 = fieldWeight in 8235, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.375 = fieldNorm(doc=8235)
    
  4. Huang, M.-H.: Developing an ideal online thesaurus display format (1994) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:huang in 4030) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 4030, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=4030)
    
  5. Huang, M.-h.: End-users' searching behaviour : changes in search type over time (1996) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:huang in 5128) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 5128, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=5128)
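
The author scores above each break down into the product tf × idf × fieldNorm. The following is a minimal sketch of that arithmetic, not the catalog's actual code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the values listed here; other Lucene versions may compute idf slightly differently. The function names are hypothetical.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency (assumed form)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form that matches the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1: freq=1.0, docFreq=89, maxDocs=44218, fieldNorm=0.625
entry_1 = field_weight(1.0, 89, 44218, 0.625)
# Entry 3: freq=2.0 and a smaller fieldNorm of 0.375
entry_3 = field_weight(2.0, 89, 44218, 0.375)
print(entry_1, entry_3)
```

Under these assumptions the two results should agree with the listed 4.4981737 and 3.8168268 to within floating-point rounding. Entry 3 matches the term twice but still scores lower because its fieldNorm is smaller.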
    

Similar documents (content)

  1. Gao, R.; Ge, Y.; Sha, C.: FAIR: Fairness-aware information retrieval evaluation (2022) 0.26
    0.25679165 = sum of:
      0.25679165 = product of:
        0.91711307 = sum of:
          0.009356974 = weight(abstract_txt:information in 669) [ClassicSimilarity], result of:
            0.009356974 = score(doc=669,freq=3.0), product of:
              0.035703402 = queryWeight, product of:
                1.2092627 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012195617 = queryNorm
              0.26207513 = fieldWeight in 669, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.050585426 = weight(abstract_txt:diversity in 669) [ClassicSimilarity], result of:
            0.050585426 = score(doc=669,freq=1.0), product of:
              0.12589255 = queryWeight, product of:
                1.60565 = boost
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.012195617 = queryNorm
              0.4018143 = fieldWeight in 669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.11085709 = weight(abstract_txt:metrics in 669) [ClassicSimilarity], result of:
            0.11085709 = score(doc=669,freq=4.0), product of:
              0.13380492 = queryWeight, product of:
                1.6553388 = boost
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.012195617 = queryNorm
              0.8284979 = fieldWeight in 669, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.015978498 = weight(abstract_txt:retrieval in 669) [ClassicSimilarity], result of:
            0.015978498 = score(doc=669,freq=1.0), product of:
              0.073567115 = queryWeight, product of:
                1.7358321 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012195617 = queryNorm
              0.21719621 = fieldWeight in 669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.01253513 = weight(abstract_txt:that in 669) [ClassicSimilarity], result of:
            0.01253513 = score(doc=669,freq=2.0), product of:
              0.059852414 = queryWeight, product of:
                2.0712175 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012195617 = queryNorm
              0.20943399 = fieldWeight in 669, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.4562713 = weight(abstract_txt:metric in 669) [ClassicSimilarity], result of:
            0.4562713 = score(doc=669,freq=6.0), product of:
              0.4074379 = queryWeight, product of:
                4.567216 = boost
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.012195617 = queryNorm
              1.1198548 = fieldWeight in 669, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
          0.2615287 = weight(abstract_txt:novelty in 669) [ClassicSimilarity], result of:
            0.2615287 = score(doc=669,freq=1.0), product of:
              0.54287905 = queryWeight, product of:
                5.775147 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.012195617 = queryNorm
              0.48174396 = fieldWeight in 669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=669)
        0.28 = coord(7/25)
    
  2. Xiaoyan Li, X.; Croft, W.B.: An information-pattern-based approach to novelty detection (2008) 0.20
    0.19811885 = sum of:
      0.19811885 = product of:
        0.99059427 = sum of:
          0.03750641 = weight(abstract_txt:level in 2080) [ClassicSimilarity], result of:
            0.03750641 = score(doc=2080,freq=3.0), product of:
              0.061622545 = queryWeight, product of:
                1.1233644 = boost
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.012195617 = queryNorm
              0.6086475 = fieldWeight in 2080, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.078125 = fieldNorm(doc=2080)
          0.01654095 = weight(abstract_txt:information in 2080) [ClassicSimilarity], result of:
            0.01654095 = score(doc=2080,freq=6.0), product of:
              0.035703402 = queryWeight, product of:
                1.2092627 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012195617 = queryNorm
              0.46328777 = fieldWeight in 2080, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=2080)
          0.060542434 = weight(abstract_txt:proposed in 2080) [ClassicSimilarity], result of:
            0.060542434 = score(doc=2080,freq=3.0), product of:
              0.09706731 = queryWeight, product of:
                1.726764 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.012195617 = queryNorm
              0.623716 = fieldWeight in 2080, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.078125 = fieldNorm(doc=2080)
          0.0110795945 = weight(abstract_txt:that in 2080) [ClassicSimilarity], result of:
            0.0110795945 = score(doc=2080,freq=1.0), product of:
              0.059852414 = queryWeight, product of:
                2.0712175 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012195617 = queryNorm
              0.18511525 = fieldWeight in 2080, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2080)
          0.86492485 = weight(abstract_txt:novelty in 2080) [ClassicSimilarity], result of:
            0.86492485 = score(doc=2080,freq=7.0), product of:
              0.54287905 = queryWeight, product of:
                5.775147 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.012195617 = queryNorm
              1.5932183 = fieldWeight in 2080, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.078125 = fieldNorm(doc=2080)
        0.2 = coord(5/25)
    
  3. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 0.20
    0.19657753 = sum of:
      0.19657753 = product of:
        0.7020626 = sum of:
          0.030005127 = weight(abstract_txt:level in 2210) [ClassicSimilarity], result of:
            0.030005127 = score(doc=2210,freq=3.0), product of:
              0.061622545 = queryWeight, product of:
                1.1233644 = boost
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.012195617 = queryNorm
              0.486918 = fieldWeight in 2210, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.009356974 = weight(abstract_txt:information in 2210) [ClassicSimilarity], result of:
            0.009356974 = score(doc=2210,freq=3.0), product of:
              0.035703402 = queryWeight, product of:
                1.2092627 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012195617 = queryNorm
              0.26207513 = fieldWeight in 2210, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.029699188 = weight(abstract_txt:higher in 2210) [ClassicSimilarity], result of:
            0.029699188 = score(doc=2210,freq=1.0), product of:
              0.088269934 = queryWeight, product of:
                1.344489 = boost
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.012195617 = queryNorm
              0.3364587 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.01687454 = weight(abstract_txt:than in 2210) [ClassicSimilarity], result of:
            0.01687454 = score(doc=2210,freq=1.0), product of:
              0.06931621 = queryWeight, product of:
                1.4591969 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.012195617 = queryNorm
              0.24344292 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.015978498 = weight(abstract_txt:retrieval in 2210) [ClassicSimilarity], result of:
            0.015978498 = score(doc=2210,freq=1.0), product of:
              0.073567115 = queryWeight, product of:
                1.7358321 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012195617 = queryNorm
              0.21719621 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.015352336 = weight(abstract_txt:that in 2210) [ClassicSimilarity], result of:
            0.015352336 = score(doc=2210,freq=3.0), product of:
              0.059852414 = queryWeight, product of:
                2.0712175 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012195617 = queryNorm
              0.2565032 = fieldWeight in 2210, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.58479595 = weight(abstract_txt:novelty in 2210) [ClassicSimilarity], result of:
            0.58479595 = score(doc=2210,freq=5.0), product of:
              0.54287905 = queryWeight, product of:
                5.775147 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.012195617 = queryNorm
              1.0772122 = fieldWeight in 2210, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
        0.28 = coord(7/25)
    
  4. Xu, Y.; Yin, H.: Novelty and topicality in interactive information retrieval (2008) 0.19
    0.18766844 = sum of:
      0.18766844 = product of:
        0.78195184 = sum of:
          0.007639937 = weight(abstract_txt:information in 1355) [ClassicSimilarity], result of:
            0.007639937 = score(doc=1355,freq=2.0), product of:
              0.035703402 = queryWeight, product of:
                1.2092627 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012195617 = queryNorm
              0.21398345 = fieldWeight in 1355, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
          0.029699188 = weight(abstract_txt:higher in 1355) [ClassicSimilarity], result of:
            0.029699188 = score(doc=1355,freq=1.0), product of:
              0.088269934 = queryWeight, product of:
                1.344489 = boost
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.012195617 = queryNorm
              0.3364587 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
          0.01687454 = weight(abstract_txt:than in 1355) [ClassicSimilarity], result of:
            0.01687454 = score(doc=1355,freq=1.0), product of:
              0.06931621 = queryWeight, product of:
                1.4591969 = boost
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.012195617 = queryNorm
              0.24344292 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8950868 = idf(docFreq=2444, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
          0.015978498 = weight(abstract_txt:retrieval in 1355) [ClassicSimilarity], result of:
            0.015978498 = score(doc=1355,freq=1.0), product of:
              0.073567115 = queryWeight, product of:
                1.7358321 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012195617 = queryNorm
              0.21719621 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
          0.019819781 = weight(abstract_txt:that in 1355) [ClassicSimilarity], result of:
            0.019819781 = score(doc=1355,freq=5.0), product of:
              0.059852414 = queryWeight, product of:
                2.0712175 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012195617 = queryNorm
              0.3311442 = fieldWeight in 1355, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
          0.6919399 = weight(abstract_txt:novelty in 1355) [ClassicSimilarity], result of:
            0.6919399 = score(doc=1355,freq=7.0), product of:
              0.54287905 = queryWeight, product of:
                5.775147 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.012195617 = queryNorm
              1.2745746 = fieldWeight in 1355, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=1355)
        0.24 = coord(6/25)
    
  5. MacCall, S.L.; Cleveland, A.D.: A relevance-based quantitative measure for Internet information retrieval evaluation (1999) 0.17
    0.17470889 = sum of:
      0.17470889 = product of:
        0.6239603 = sum of:
          0.011696218 = weight(abstract_txt:information in 6689) [ClassicSimilarity], result of:
            0.011696218 = score(doc=6689,freq=3.0), product of:
              0.035703402 = queryWeight, product of:
                1.2092627 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012195617 = queryNorm
              0.32759392 = fieldWeight in 6689, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.06928568 = weight(abstract_txt:metrics in 6689) [ClassicSimilarity], result of:
            0.06928568 = score(doc=6689,freq=1.0), product of:
              0.13380492 = queryWeight, product of:
                1.6553388 = boost
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.012195617 = queryNorm
              0.5178112 = fieldWeight in 6689, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.034954194 = weight(abstract_txt:proposed in 6689) [ClassicSimilarity], result of:
            0.034954194 = score(doc=6689,freq=1.0), product of:
              0.09706731 = queryWeight, product of:
                1.726764 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.012195617 = queryNorm
              0.36010262 = fieldWeight in 6689, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.039946243 = weight(abstract_txt:retrieval in 6689) [ClassicSimilarity], result of:
            0.039946243 = score(doc=6689,freq=4.0), product of:
              0.073567115 = queryWeight, product of:
                1.7358321 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.012195617 = queryNorm
              0.5429905 = fieldWeight in 6689, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.0110795945 = weight(abstract_txt:that in 6689) [ClassicSimilarity], result of:
            0.0110795945 = score(doc=6689,freq=1.0), product of:
              0.059852414 = queryWeight, product of:
                2.0712175 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.012195617 = queryNorm
              0.18511525 = fieldWeight in 6689, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.053707764 = weight(abstract_txt:evaluation in 6689) [ClassicSimilarity], result of:
            0.053707764 = score(doc=6689,freq=1.0), product of:
              0.15324317 = queryWeight, product of:
                2.8009892 = boost
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.012195617 = queryNorm
              0.35047412 = fieldWeight in 6689, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
          0.4032906 = weight(abstract_txt:metric in 6689) [ClassicSimilarity], result of:
            0.4032906 = score(doc=6689,freq=3.0), product of:
              0.4074379 = queryWeight, product of:
                4.567216 = boost
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.012195617 = queryNorm
              0.9898211 = fieldWeight in 6689, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.078125 = fieldNorm(doc=6689)
        0.28 = coord(7/25)
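
The content scores above combine per-term scores (queryWeight × fieldWeight), sum them, and scale the sum by the coord factor (matched query terms / total query terms). The sketch below recomputes the score of entry 1 (doc 669) from the boost, idf, freq and fieldNorm values copied out of its explanation. It only illustrates the arithmetic; the names `TERMS`, `term_score` and `document_score` are hypothetical, and tf = sqrt(freq) is assumed.

```python
import math

# (term, boost, idf, freq, fieldNorm), copied from the explanation for doc 669
TERMS = [
    ("information", 1.2092627, 2.4209464, 3.0, 0.0625),
    ("diversity", 1.60565, 6.429029, 1.0, 0.0625),
    ("metrics", 1.6553388, 6.627983, 4.0, 0.0625),
    ("retrieval", 1.7358321, 3.4751394, 1.0, 0.0625),
    ("that", 2.0712175, 2.3694751, 2.0, 0.0625),
    ("metric", 4.567216, 7.314861, 6.0, 0.0625),
    ("novelty", 5.775147, 7.7079034, 1.0, 0.0625),
]
QUERY_NORM = 0.012195617  # queryNorm shown in every term explanation


def term_score(boost, idf, freq, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(terms, matched, total):
    """Sum of term scores scaled by coord = matched / total query terms."""
    raw = sum(term_score(b, i, f, n) for _, b, i, f, n in terms)
    return raw * (matched / total)


score = document_score(TERMS, matched=7, total=25)
print(score)
```

With these inputs the result should land close to the listed 0.25679165. Most of it comes from the high-boost, high-idf terms "metric" and "novelty", while common words such as "that" contribute very little.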