Search (3 results, page 1 of 1)

  • Active filter: author_ss:"Wacholder, N."
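
  The active filter above is a fielded query against a multivalued string field (the "_ss" suffix is a common Solr dynamic-field naming convention). As a minimal sketch of how the same filter might be issued programmatically; the host, core name ("biblio"), and use of Python's requests library are assumptions, not details taken from this page:

    import requests

    # Re-issue the page's active filter as a Solr fielded query.
    # The host, core, and handler below are placeholders; only the field
    # name author_ss and the quoted value come from the filter chip above.
    params = {
        "q": 'author_ss:"Wacholder, N."',
        "rows": 10,
        "debugQuery": "true",  # asks Solr to return explain trees like those below
    }
    resp = requests.get("http://localhost:8983/solr/biblio/select", params=params)
    print(resp.json()["response"]["numFound"])  # expected: 3 for this filter
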
  1. Wacholder, N.; Kelly, D.; Kantor, P.; Rittman, R.; Sun, Y.; Bai, B.; Small, S.; Yamrom, B.; Strzalkowski, T.: A model for quantitative evaluation of an end-to-end question-answering system (2007) 0.02
    0.018938396 = product of:
      0.037876792 = sum of:
        0.037876792 = product of:
          0.075753585 = sum of:
            0.075753585 = weight(_text_:b in 435) [ClassicSimilarity], result of:
              0.075753585 = score(doc=435,freq=8.0), product of:
                0.16126883 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.045518078 = queryNorm
                0.46973482 = fieldWeight in 435, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.046875 = fieldNorm(doc=435)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
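
    Read bottom-up, the tree above is Lucene's ClassicSimilarity (TF-IDF) arithmetic: tf is the square root of the raw term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), fieldWeight = tf * idf * fieldNorm, queryWeight = idf * queryNorm, and the document score is queryWeight * fieldWeight scaled by the two coord(1/2) factors. A minimal sketch that reproduces the numbers, taking queryNorm and fieldNorm as given from the tree:

      import math

      # ClassicSimilarity pieces; constants are copied from the explain tree.
      def tf(freq):                 # sqrt(freq): 2.828427 for freq = 8
          return math.sqrt(freq)

      def idf(doc_freq, max_docs):  # 1 + ln(maxDocs/(docFreq+1)): 3.542962
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      query_norm = 0.045518078      # given; depends on the full query, taken as-is
      field_norm = 0.046875         # stored length norm for doc 435, taken as-is

      query_weight = idf(3476, 44218) * query_norm             # 0.16126883
      field_weight = tf(8.0) * idf(3476, 44218) * field_norm   # 0.46973482

      score = query_weight * field_weight   # 0.075753585
      score *= 0.5 * 0.5                    # the two nested coord(1/2) factors
      print(score)                          # 0.018938396, the score shown above

    The trees for results 2 and 3 follow the same computation; only freq (2.0 rather than 8.0) and fieldNorm differ, which is why those scores are lower.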
    
    Abstract
    We describe a procedure for quantitative evaluation of interactive question-answering systems and illustrate it with application to the High-Quality Interactive Question-Answering (HITIQA) system. Our objectives were (a) to design a method to realistically and reliably assess interactive question-answering systems by comparing the quality of reports produced using different systems, (b) to conduct a pilot test of this method, and (c) to perform a formative evaluation of the HITIQA system. Far more important than the specific information gathered from this pilot evaluation is the development of (a) a protocol for evaluating an emerging technology, (b) reusable assessment instruments, and (c) the knowledge gained in conducting the evaluation. We conclude that this method, which uses a surprisingly small number of subjects and does not rely on predetermined relevance judgments, measures the impact of system change on work produced by users. Therefore, this method can be used to compare the product of interactive systems that use different underlying technologies.
  2. Ng, K.B.; Kantor, P.B.; Strzalkowski, T.; Wacholder, N.; Tang, R.; Bai, B.; Rittman, R.; Song, P.; Sun, Y.: Automated judgment of document qualities (2006) 0.01
    0.009469198 = product of:
      0.018938396 = sum of:
        0.018938396 = product of:
          0.037876792 = sum of:
            0.037876792 = weight(_text_:b in 182) [ClassicSimilarity], result of:
              0.037876792 = score(doc=182,freq=2.0), product of:
                0.16126883 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.045518078 = queryNorm
                0.23486741 = fieldWeight in 182, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.046875 = fieldNorm(doc=182)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  3. Wacholder, N.; Liu, L.: User preference : a measure of query-term quality (2006) 0.01
    0.007890998 = product of:
      0.015781997 = sum of:
        0.015781997 = product of:
          0.031563994 = sum of:
            0.031563994 = weight(_text_:b in 19) [ClassicSimilarity], result of:
              0.031563994 = score(doc=19,freq=2.0), product of:
                0.16126883 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.045518078 = queryNorm
                0.19572285 = fieldWeight in 19, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=19)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of this research is to understand what characteristics, if any, lead users engaged in interactive information seeking to prefer certain sets of query terms. Underlying this work is the assumption that query terms that information seekers prefer induce a kind of cognitive efficiency: They require less mental effort to process and therefore reduce the energy required in the interactive information-seeking process. Conceptually, this work applies insights from linguistics and cognitive science to the study of query-term quality. We report on an experiment in which we compare user preference for three sets of terms: one had been preconstructed by a human indexer, and two were identified automatically. Twenty-four participants used a merged list of all terms to answer a carefully created set of questions. By design, the interface constrained users to access the text exclusively via the displayed list of query terms. We found that participants' preference for the human-constructed set of terms was eight times greater than their preference for either set of automatically identified terms. We speculate about reasons for this strong preference and discuss the implications for information access. The primary contributions of this research are (a) explication of the concept of user preference as a measure of query-term quality and (b) identification of a replicable procedure for measuring preference for sets of query terms created by different methods, whether human or automatic. All other factors being equal, query terms that users prefer clearly are the best choice for real-world information-access systems.