Search (11 results, page 1 of 1)

Ravana, S.D.; Taheri, M.S.; Rajagopal, P.: Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems (2015) 0.03
```
0.03297302 = product of:
  0.06594604 = sum of:
    0.06594604 = sum of:
      0.030652853 = weight(_text_:web in 2587) [ClassicSimilarity], result of:
        0.030652853 = score(doc=2587,freq=2.0), product of:
          0.17002425 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.052098576 = queryNorm
          0.18028519 = fieldWeight in 2587, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2587)
      0.03529319 = weight(_text_:22 in 2587) [ClassicSimilarity], result of:
        0.03529319 = score(doc=2587,freq=2.0), product of:
          0.18244034 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052098576 = queryNorm
          0.19345059 = fieldWeight in 2587, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2587)
  0.5 = coord(1/2)
```
Abstract

Purpose The purpose of this paper is to propose a method to have more accurate results in comparing performance of the paired information retrieval (IR) systems with reference to the current method, which is based on the mean effectiveness scores of the systems across a set of identified topics/queries. Design/methodology/approach Based on the proposed approach, instead of the classic method of using a set of topic scores, the documents level scores are considered as the evaluation unit. These document scores are the defined document's weight, which play the role of the mean average precision (MAP) score of the systems as a significance test's statics. The experiments were conducted using the TREC 9 Web track collection. Findings The p-values generated through the two types of significance tests, namely the Student's t-test and Mann-Whitney show that by using the document level scores as an evaluation unit, the difference between IR systems is more significant compared with utilizing topic scores. Originality/value Utilizing a suitable test collection is a primary prerequisite for IR systems comparative evaluation. However, in addition to reusable test collections, having an accurate statistical testing is a necessity for these evaluations. The findings of this study will assist IR researchers to evaluate their retrieval systems and algorithms more accurately.

Date

20. 1.2015 18:30:22
Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.01
```
0.013273074 = product of:
  0.026546149 = sum of:
    0.026546149 = product of:
      0.053092297 = sum of:
        0.053092297 = weight(_text_:web in 3144) [ClassicSimilarity], result of:
          0.053092297 = score(doc=3144,freq=6.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.3122631 = fieldWeight in 3144, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3144)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Users of web search engines are known to mostly focus on the top ranked results of the search engine result page. While many studies support this well known information seeking pattern only few studies concentrate on the question what users are missing by neglecting lower ranked results. To learn more about the relevance distributions in the so-called long tail we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in the content between the head and the tail of the search engine result list we see no statistical significant differences in the binary relevance judgments and weak significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists but it needs more evaluation to clearly describe the differences.
Behnert, C.; Lewandowski, D.: ¬A framework for designing retrieval effectiveness studies of library information systems using human relevance assessments (2017) 0.01
```
0.01083742 = product of:
  0.02167484 = sum of:
    0.02167484 = product of:
      0.04334968 = sum of:
        0.04334968 = weight(_text_:web in 3700) [ClassicSimilarity], result of:
          0.04334968 = score(doc=3700,freq=4.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.25496176 = fieldWeight in 3700, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3700)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Purpose This paper demonstrates how to apply traditional information retrieval evaluation methods based on standards from the Text REtrieval Conference (TREC) and web search evaluation to all types of modern library information systems including online public access catalogs, discovery systems, and digital libraries that provide web search features to gather information from heterogeneous sources. Design/methodology/approach We apply conventional procedures from information retrieval evaluation to the library information system context considering the specific characteristics of modern library materials. Findings We introduce a framework consisting of five parts: (1) search queries, (2) search results, (3) assessors, (4) testing, and (5) data analysis. We show how to deal with comparability problems resulting from diverse document types, e.g., electronic articles vs. printed monographs and what issues need to be considered for retrieval tests in the library context. Practical implications The framework can be used as a guideline for conducting retrieval effectiveness studies in the library context. Originality/value Although a considerable amount of research has been done on information retrieval evaluation, and standards for conducting retrieval effectiveness studies do exist, to our knowledge this is the first attempt to provide a systematic framework for evaluating the retrieval effectiveness of twenty-first-century library information systems. We demonstrate which issues must be considered and what decisions must be made by researchers prior to a retrieval test.
Sarigil, E.; Sengor Altingovde, I.; Blanco, R.; Barla Cambazoglu, B.; Ozcan, R.; Ulusoy, Ö.: Characterizing, predicting, and handling web search queries that match very few or no results (2018) 0.01
```
0.01083742 = product of:
  0.02167484 = sum of:
    0.02167484 = product of:
      0.04334968 = sum of:
        0.04334968 = weight(_text_:web in 4039) [ClassicSimilarity], result of:
          0.04334968 = score(doc=4039,freq=4.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.25496176 = fieldWeight in 4039, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4039)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that the users are dissatisfied after submitting a query that match no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as the mobile- or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of the bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.

Reichert, S.; Mayr, P.: Untersuchung von Relevanzeigenschaften in einem kontrollierten Eyetracking-Experiment (2012) 0.01

0.010587957 = product of:
  0.021175914 = sum of:
    0.021175914 = product of:
      0.042351827 = sum of:
        0.042351827 = weight(_text_:22 in 328) [ClassicSimilarity], result of:
          0.042351827 = score(doc=328,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.23214069 = fieldWeight in 328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=328)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 7.2012 19:25:54

Geist, K.: Qualität und Relevanz von bildungsbezogenen Suchergebnissen bei der Suche im Web (2012) 0.01

0.009195855 = product of:
  0.01839171 = sum of:
    0.01839171 = product of:
      0.03678342 = sum of:
        0.03678342 = weight(_text_:web in 570) [ClassicSimilarity], result of:
          0.03678342 = score(doc=570,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.21634221 = fieldWeight in 570, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=570)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Pal, S.; Mitra, M.; Kamps, J.: Evaluation effort, reliability and reusability in XML retrieval (2011) 0.01

0.008823298 = product of:
  0.017646596 = sum of:
    0.017646596 = product of:
      0.03529319 = sum of:
        0.03529319 = weight(_text_:22 in 4197) [ClassicSimilarity], result of:
          0.03529319 = score(doc=4197,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.19345059 = fieldWeight in 4197, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4197)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 1.2011 14:20:56

Chu, H.: Factors affecting relevance judgment : a report from TREC Legal track (2011) 0.01

0.008823298 = product of:
  0.017646596 = sum of:
    0.017646596 = product of:
      0.03529319 = sum of:
        0.03529319 = weight(_text_:22 in 4540) [ClassicSimilarity], result of:
          0.03529319 = score(doc=4540,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.19345059 = fieldWeight in 4540, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4540)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 12. 7.2011 18:29:22

Wildemuth, B.; Freund, L.; Toms, E.G.: Untangling search task complexity and difficulty in the context of interactive information retrieval studies (2014) 0.01

0.008823298 = product of:
  0.017646596 = sum of:
    0.017646596 = product of:
      0.03529319 = sum of:
        0.03529319 = weight(_text_:22 in 1786) [ClassicSimilarity], result of:
          0.03529319 = score(doc=1786,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.19345059 = fieldWeight in 1786, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1786)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 6. 4.2015 19:31:22

Rajagopal, P.; Ravana, S.D.; Koh, Y.S.; Balakrishnan, V.: Evaluating the effectiveness of information retrieval systems using effort-based relevance judgment (2019) 0.01

0.008823298 = product of:
  0.017646596 = sum of:
    0.017646596 = product of:
      0.03529319 = sum of:
        0.03529319 = weight(_text_:22 in 5287) [ClassicSimilarity], result of:
          0.03529319 = score(doc=5287,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.19345059 = fieldWeight in 5287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5287)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 20. 1.2015 18:30:22

Borlund, P.: ¬A study of the use of simulated work task situations in interactive information retrieval evaluations : a meta-evaluation (2016) 0.01
```
0.0061305705 = product of:
  0.012261141 = sum of:
    0.012261141 = product of:
      0.024522282 = sum of:
        0.024522282 = weight(_text_:web in 2880) [ClassicSimilarity], result of:
          0.024522282 = score(doc=2880,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.14422815 = fieldWeight in 2880, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=2880)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Purpose - The purpose of this paper is to report a study of how the test instrument of a simulated work task situation is used in empirical evaluations of interactive information retrieval (IIR) and reported in the research literature. In particular, the author is interested to learn whether the requirements of how to employ simulated work task situations are followed, and whether these requirements call for further highlighting and refinement. Design/methodology/approach - In order to study how simulated work task situations are used, the research literature in question is identified. This is done partly via citation analysis by use of Web of Science®, and partly by systematic search of online repositories. On this basis, 67 individual publications were identified and they constitute the sample of analysis. Findings - The analysis reveals a need for clarifications of how to use simulated work task situations in IIR evaluations. In particular, with respect to the design and creation of realistic simulated work task situations. There is a lack of tailoring of the simulated work task situations to the test participants. Likewise, the requirement to include the test participants' personal information needs is neglected. Further, there is a need to add and emphasise a requirement to depict the used simulated work task situations when reporting the IIR studies. Research limitations/implications - Insight about the use of simulated work task situations has implications for test design of IIR studies and hence the knowledge base generated on the basis of such studies. Originality/value - Simulated work task situations are widely used in IIR studies, and the present study is the first comprehensive study of the intended and unintended use of this test instrument since its introduction in the late 1990's. The paper addresses the need to carefully design and tailor simulated work task situations to suit the test participants in order to obtain the intended authentic and realistic IIR under study.

Search (11 results, page 1 of 1)

Authors

Languages

Types