Search (3 results, page 1 of 1)

  • × author_ss:"Kantor, P.B."
  • × author_ss:"Sun, Y."
  1. Sun, Y.; Kantor, P.B.: Cross-evaluation : a new model for information system evaluation (2006) 0.00
    Abstract
    In this article, we introduce a new information system evaluation method and report on its application to a collaborative information seeking system, AntWorld. The key innovation of the new method is to use precisely the same group of users who work with the system as judges, an approach we call cross-evaluation. In the new method, we also propose to assess the system at the level of task completion. The obvious potential limitation of this method is that individuals may be inclined to think more highly of the materials that they themselves have found, and are almost certain to think more highly of their own work product than they do of the products built by others. The keys to neutralizing this problem are careful design and a corresponding analytical model based on analysis of variance. We model the several measures of task completion with a linear model of five effects, describing the users who interact with the system, the system used to finish the task, the task itself, the behavior of individuals as judges, and the self-judgment bias. Our analytical method successfully isolates the effect of each variable. This approach provides a successful model to make concrete the "three realities" paradigm, which calls for "real tasks," "real users," and "real systems." (A sketch of this five-effect model, in illustrative notation, follows this record.)
    Type
    a
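    The five-effect linear model sketched in the abstract can be written out explicitly. The notation below is an illustrative reconstruction rather than the article's own formulation; the symbol names and the indicator term for self-judgment are assumptions.

      y_{u,s,t,j} = \mu + \alpha_u + \beta_s + \gamma_t + \delta_j + \lambda\,\mathbf{1}[u = j] + \varepsilon_{u,s,t,j}

    Here y_{u,s,t,j} is judge j's score of the work product that user u produced on task t with system s; \alpha, \beta, \gamma, and \delta capture the user, system, task, and judge effects, and \lambda is the self-judgment bias that applies only when a judge rates their own work. Analysis of variance over a model of this form is what lets the system effect \beta be separated from the other sources of variation.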
  2. Ng, K.B.; Kantor, P.B.; Strzalkowski, T.; Wacholder, N.; Tang, R.; Bai, B.; Rittman; Song, P.; Sun, Y.: Automated judgment of document qualities (2006) 0.00
    Abstract
    The authors report on a series of experiments to automate the assessment of document qualities such as depth and objectivity. The primary purpose is to develop a quality-sensitive functionality, orthogonal to relevance, to select documents for an interactive question-answering system. The study consisted of two stages. In the classifier construction stage, nine document qualities deemed important by information professionals were identified and classifiers were developed to predict their values. In the confirmative evaluation stage, the performance of the developed methods was checked using a different document collection. The quality prediction methods worked well in the second stage. The results strongly suggest that the best way to predict document qualities automatically is to construct classifiers on a person-by-person basis. (A sketch of such a per-assessor classifier setup follows this record.)
    Type
    a
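    A minimal Python sketch of the person-by-person classifier construction described in the abstract, assuming plain-text documents, per-assessor quality labels, and a TF-IDF plus logistic-regression pipeline; the feature choice, the scikit-learn components, and the variable names are illustrative assumptions, not the authors' implementation.

      # Hedged sketch: one quality classifier per assessor, as the abstract suggests.
      # Assumes each assessor labeled a quality (e.g. "depth") as high/low per document.
      from collections import defaultdict
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.pipeline import make_pipeline

      def train_per_assessor(judgments):
          """judgments: iterable of (assessor_id, document_text, quality_label) tuples."""
          by_assessor = defaultdict(list)
          for assessor, text, label in judgments:
              by_assessor[assessor].append((text, label))

          models = {}
          for assessor, rows in by_assessor.items():
              texts, labels = zip(*rows)
              clf = make_pipeline(TfidfVectorizer(min_df=2),
                                  LogisticRegression(max_iter=1000))
              clf.fit(list(texts), list(labels))  # fit on this assessor's judgments only
              models[assessor] = clf
          return models

    Keeping the training data separated by assessor is the point of the sketch: it mirrors the abstract's conclusion that document qualities are best predicted on a person-by-person basis rather than with one pooled model.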
  3. Sun, Y.; Kantor, P.B.; Morse, E.L.: Using cross-evaluation to evaluate interactive QA systems (2011) 0.00
    Abstract
    In this article, we report on an experiment to assess the possibility of rigorous evaluation of interactive question-answering (QA) systems using the cross-evaluation method. This method takes into account the effects of tasks and context, and of the users of the systems. Statistical techniques are used to remove these effects, isolating the effect of the system itself. The results show that this approach yields meaningful measurements of the impact of systems on user task performance, using a surprisingly small number of subjects and without relying on predetermined judgments of the quality or relevance of materials. We conclude that the method is indeed effective for comparing end-to-end QA systems and for comparing interactive systems with high efficiency. (A sketch of the effect-isolation step follows this record.)
    Type
    a
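    As in the first record, the core of cross-evaluation is removing the user, task, and judge effects so that the system effect can be read off on its own. Below is a minimal Python sketch of that step, assuming a table of scores with one row per (user, system, task, judge) combination; the column names and the statsmodels formula are illustrative assumptions, not the authors' analysis code.

      # Hedged sketch: isolate the system effect in cross-evaluation scores via OLS/ANOVA.
      # Assumes `scores` has columns: user, system, task, judge,
      # self (1 if the judge rates their own work), and score.
      import pandas as pd
      import statsmodels.formula.api as smf
      from statsmodels.stats.anova import anova_lm

      def system_effects(scores: pd.DataFrame) -> pd.Series:
          model = smf.ols(
              "score ~ C(user) + C(system) + C(task) + C(judge) + self",
              data=scores,
          ).fit()
          print(anova_lm(model, typ=2))  # variance attributable to each effect
          # Coefficients for the system factor, with the other effects fitted out.
          return model.params.filter(like="C(system)")

    The returned coefficients are per-system effects estimated after the user, task, judge, and self-judgment terms have been accounted for, which is the sense in which the method compares systems while controlling for who used them, what they did, and who judged the results.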