Search (2 results, page 1 of 1)

  • × author_ss:"Sun, Y."
  • × author_ss:"Kantor, P.B."
  1. Sun, Y.; Kantor, P.B.; Morse, E.L.: Using cross-evaluation to evaluate interactive QA systems (2011) 0.02
    0.020007102 = product of:
      0.040014204 = sum of:
        0.040014204 = product of:
          0.08002841 = sum of:
            0.08002841 = weight(_text_:systems in 4744) [ClassicSimilarity], result of:
              0.08002841 = score(doc=4744,freq=12.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.4990213 = fieldWeight in 4744, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4744)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article, we report on an experiment to assess the possibility of rigorous evaluation of interactive question-answering (QA) systems using the cross-evaluation method. This method takes into account the effects of tasks and context, and of the users of the systems. Statistical techniques are used to remove these effects, isolating the effect of the system itself. The results show that this approach yields meaningful measurements of the impact of systems on user task performance, using a surprisingly small number of subjects and without relying on predetermined judgments of the quality, or of the relevance of materials. We conclude that the method is indeed effective for comparing end-to-end QA systems, and for comparing interactive systems with high efficiency.
  2. Sun, Y.; Kantor, P.B.: Cross-evaluation : a new model for information system evaluation (2006) 0.01
    0.0068065543 = product of:
      0.013613109 = sum of:
        0.013613109 = product of:
          0.027226217 = sum of:
            0.027226217 = weight(_text_:systems in 5048) [ClassicSimilarity], result of:
              0.027226217 = score(doc=5048,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.1697705 = fieldWeight in 5048, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5048)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article, we introduce a new information system evaluation method and report on its application to a collaborative information seeking system, AntWorld. The key innovation of the new method is to use precisely the same group of users who work with the system as judges, a system we call Cross-Evaluation. In the new method, we also propose to assess the system at the level of task completion. The obvious potential limitation of this method is that individuals may be inclined to think more highly of the materials that they themselves have found and are almost certain to think more highly of their own work product than they do of the products built by others. The keys to neutralizing this problem are careful design and a corresponding analytical model based on analysis of variance. We model the several measures of task completion with a linear model of five effects, describing the users who interact with the system, the system used to finish the task, the task itself, the behavior of individuals as judges, and the selfjudgment bias. Our analytical method successfully isolates the effect of each variable. This approach provides a successful model to make concrete the "threerealities" paradigm, which calls for "real tasks," "real users," and "real systems."