Search (2 results, page 1 of 1)

Baillie, M.; Azzopardi, L.; Ruthven, I.: Evaluating epistemic uncertainty under incomplete assessments (2008) 0.00
```
0.002269176 = product of:
  0.004538352 = sum of:
    0.004538352 = product of:
      0.009076704 = sum of:
        0.009076704 = weight(_text_:a in 2065) [ClassicSimilarity], result of:
          0.009076704 = score(doc=2065,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.1709182 = fieldWeight in 2065, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2065)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
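The score tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function name `classic_score` and its parameters are hypothetical and only mirror the labels in the explain output.

```python
import math

def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute a single-term score from the values in an explain tree.

    Assumes ClassicSimilarity's TF-IDF formulas; this is an illustration,
    not the search engine's own code.
    """
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 1.153047 in the tree
    tf = math.sqrt(freq)                              # 3.1622777 for freq=10
    query_weight = idf * query_norm                   # queryWeight
    field_weight = tf * idf * field_norm              # fieldWeight
    score = query_weight * field_weight
    for c in coords:                                  # each coord(1/2) halves the score
        score *= c
    return score

# Values copied from the explain tree for doc 2065.
score_2065 = classic_score(
    freq=10.0, doc_freq=37942, max_docs=44218,
    query_norm=0.046056706, field_norm=0.046875,
    coords=(0.5, 0.5),
)
print(round(score_2065, 9))
```

With the inputs from the tree, the result should agree with the reported 0.002269176 up to single-precision rounding. The low idf (docFreq=37942 of 44218 documents) shows that the matched term `a` occurs in most records, which is why both hits display as 0.00.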
Abstract

This study proposes an extended methodology for laboratory-based Information Retrieval evaluation under incomplete relevance assessments. The new methodology aims to identify potential uncertainty during system comparison that may result from incompleteness. Adopting this methodology is advantageous because the detection of epistemic uncertainty - the amount of knowledge (or ignorance) we have about the estimate of a system's performance - during the evaluation process can guide researchers when evaluating new systems over existing and future test collections. Across a series of experiments we demonstrate how this methodology can lead towards a finer-grained analysis of systems. In particular, we show experimentally that the current practice in Information Retrieval evaluation of using a measurement depth larger than the pooling depth increases uncertainty during system comparison.

Type

a
Ruthven, I.; Baillie, M.; Elsweiler, D.: The relative effects of knowledge, interest and confidence in assessing relevance (2007) 0.00
```
0.0018909799 = product of:
  0.0037819599 = sum of:
    0.0037819599 = product of:
      0.0075639198 = sum of:
        0.0075639198 = weight(_text_:a in 835) [ClassicSimilarity], result of:
          0.0075639198 = score(doc=835,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.14243183 = fieldWeight in 835, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=835)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
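The tree for doc 835 shares its freq, idf and queryNorm with the tree for doc 2065, so the two scores differ only through fieldNorm. The sketch below compares them, assuming ClassicSimilarity's length normalization of roughly 1/sqrt(number of terms in the field); because the stored norm is quantized, the implied lengths are only rough estimates, and the helper name `implied_length` is hypothetical.

```python
# Values copied from the two explain trees on this page.
doc_2065 = {"score": 0.002269176, "field_norm": 0.046875}
doc_835 = {"score": 0.0018909799, "field_norm": 0.0390625}

def implied_length(field_norm):
    """Rough term count implied by norm = 1/sqrt(length).

    The stored norm is lossy, so treat this as an order-of-magnitude estimate.
    """
    return 1.0 / field_norm ** 2

# With every other factor equal, the score ratio should match the norm ratio.
score_ratio = doc_835["score"] / doc_2065["score"]
norm_ratio = doc_835["field_norm"] / doc_2065["field_norm"]

print(f"score ratio: {score_ratio:.4f}, fieldNorm ratio: {norm_ratio:.4f}")
print(f"implied lengths: {implied_length(doc_2065['field_norm']):.0f} "
      f"vs {implied_length(doc_835['field_norm']):.0f} terms")
```

Under that assumption, the second record ranks lower simply because its indexed text is longer, not because the term matches it any less often.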
Abstract

Purpose - The purpose of this paper is to examine how different aspects of an assessor's context, in particular their knowledge of a search topic, their interest in the search topic and their confidence in assessing relevance for a topic, affect the relevance judgements made and the assessor's ability to predict which documents they will assess as being relevant. Design/methodology/approach - The study was conducted as part of the Text REtrieval Conference (TREC) HARD track. Using a specially constructed questionnaire, information was sought on TREC assessors' personal context and, using the TREC assessments gathered, the responses to the questionnaire questions were correlated with the final relevance decisions. Findings - This study found that each of the three factors (interest, knowledge and confidence) had an effect on how many documents were assessed as relevant and on the balance between how many documents were marked as marginally or highly relevant. These factors are also shown to affect an assessor's ability to predict what information they will finally mark as being relevant. Research limitations/implications - The major limitation is that the research is conducted within the TREC initiative. This means that we can report on results but cannot report on discussions with the assessors. The research implications are numerous but mainly concern the effect of personal context on the outcomes of a user study. Practical implications - One major consequence is that we should take more account of how we construct search tasks for IIR evaluation, to create tasks that are interesting and relevant to experimental subjects. Originality/value - The paper examines different search variables within one study to compare the relative effects of these variables on the search outcomes.

Type

a
