Search (1 result, page 1 of 1)

  • year_i:[2010 TO 2020}
  • author_ss:"Kantor, P.B."
  1. Sun, Y.; Kantor, P.B.; Morse, E.L.: Using cross-evaluation to evaluate interactive QA systems (2011) 0.02
    0.020007102 = product of:
      0.040014204 = sum of:
        0.040014204 = product of:
          0.08002841 = sum of:
            0.08002841 = weight(_text_:systems in 4744) [ClassicSimilarity], result of:
              0.08002841 = score(doc=4744,freq=12.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.4990213 = fieldWeight in 4744, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4744)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
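
The explain tree above is plain arithmetic, so it can be checked by hand. The following is a minimal Python sketch that reproduces the displayed score from the leaf values. It assumes the classic TF-IDF forms tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the numbers shown here; other Lucene versions may compute idf slightly differently. The function names are hypothetical, and queryNorm, fieldNorm, and the two coord factors are copied from the output rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form that matches the tree above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute the score the same way the explain tree nests its products."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight
    score = query_weight * field_weight                 # weight(_text_:systems)
    for coord in coords:                                # coord(1/2), applied twice
        score *= coord
    return score


# Leaf values copied from the explain output for doc 4744.
score = explain_score(
    freq=12.0,
    doc_freq=5561,
    max_docs=44218,
    query_norm=0.052184064,  # taken as given; depends on the whole query
    field_norm=0.046875,     # taken as given; encodes field length
    coords=(0.5, 0.5),
)
print(f"{score:.6f}")  # agrees with the displayed 0.020007102 to six decimals
```

Lucene performs these steps in 32-bit floats, so a double-precision recomputation can differ from the displayed values in the last digits.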
    
    Abstract
    In this article, we report on an experiment to assess the possibility of rigorous evaluation of interactive question-answering (QA) systems using the cross-evaluation method. This method takes into account the effects of tasks and context, and of the users of the systems. Statistical techniques are used to remove these effects, isolating the effect of the system itself. The results show that this approach yields meaningful measurements of the impact of systems on user task performance, using a surprisingly small number of subjects and without relying on predetermined judgments of the quality, or of the relevance of materials. We conclude that the method is indeed effective for comparing end-to-end QA systems, and for comparing interactive systems with high efficiency.