Search (198 results, page 1 of 10)

  • Filter: theme_ss:"Retrievalstudien"
  1. Morse, E.; Lewis, M.; Olsen, K.A.: Testing visual information retrieval methodologies case study : comparative analysis of textual, icon, graphical, and "spring" displays (2002) 0.05
    0.052418932 = product of:
      0.104837865 = sum of:
        0.090069 = weight(_text_:interfaces in 191) [ClassicSimilarity], result of:
          0.090069 = score(doc=191,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.40299654 = fieldWeight in 191, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0546875 = fieldNorm(doc=191)
        0.01476886 = product of:
          0.04430658 = sum of:
            0.04430658 = weight(_text_:systems in 191) [ClassicSimilarity], result of:
              0.04430658 = score(doc=191,freq=4.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.33612844 = fieldWeight in 191, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=191)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
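    The figures above are Lucene "explain" output for the classic TF-IDF similarity. As a point of reference, the short sketch below (Python; the tf/idf formulas follow standard Lucene ClassicSimilarity, and the constants are copied from the listing, so the names are illustrative rather than taken from the catalog software) reproduces the "interfaces" contribution to result 1 and the combined document score.

      import math

      # Sketch of Lucene ClassicSimilarity (TF-IDF) arithmetic, reproducing the
      # "interfaces" contribution shown in the explain tree for doc 191.
      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq=655, maxDocs=44218) ~ 5.2107

      def tf(freq):
          return math.sqrt(freq)                             # tf(freq=2.0) ~ 1.4142

      query_norm = 0.04289195          # queryNorm, taken from the listing
      field_norm = 0.0546875           # fieldNorm(doc=191), taken from the listing

      query_weight = idf(655, 44218) * query_norm                  # ~ 0.2235 (queryWeight)
      field_weight = tf(2.0) * idf(655, 44218) * field_norm        # ~ 0.4030 (fieldWeight)
      interfaces_score = query_weight * field_weight               # ~ 0.090069

      # The document score combines the per-term scores with coordination factors:
      # (0.090069 + 0.04430658 * 1/3) * 2/4 ~ 0.05242, the value shown for result 1.
      doc_score = (interfaces_score + 0.04430658 * (1 / 3)) * (2 / 4)
      print(round(interfaces_score, 6), round(doc_score, 6))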
    
    Abstract
    Although many different visual information retrieval systems have been proposed, few have been tested, and where testing has been performed, results were often inconclusive. Further, there is very little evidence of benchmarking systems against a common standard. An approach for testing novel interfaces is proposed that uses bottom-up, stepwise testing to allow evaluation of the visualization itself, rather than restricting evaluation to the system instantiating it. This approach not only makes it easier to control variables, but the tests are also easier to perform. The methodology is presented through a case study, where a new visualization technique is compared to more traditional ways of presenting data.
  2. Spink, A.; Goodrum, A.: ¬A study of search intermediary working notes : implications for IR system design (1996) 0.05
    0.05025608 = product of:
      0.10051216 = sum of:
        0.090069 = weight(_text_:interfaces in 6981) [ClassicSimilarity], result of:
          0.090069 = score(doc=6981,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.40299654 = fieldWeight in 6981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6981)
        0.010443161 = product of:
          0.031329483 = sum of:
            0.031329483 = weight(_text_:systems in 6981) [ClassicSimilarity], result of:
              0.031329483 = score(doc=6981,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 6981, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6981)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Reports findings from an exploratory study investigating working notes created during encoding and external storage (EES) processes by human search intermediaries using a Boolean information retrieval system. Analysis of 221 sets of working notes created by human search intermediaries revealed extensive use of EES processes and the creation of working notes of textual, numerical and graphical entities. Nearly 70% of the recorded working notes were textual/numerical entities, nearly 30% were graphical entities, and 0.73% were indiscernible. Segmentation devices were also used in 48% of the working notes. The creation of working notes during the EES processes was a fundamental element within the mediated, interactive information retrieval process. Discusses implications for the design of interfaces to support users' EES processes and further research.
  3. Davis, C.H.: From document retrieval to Web browsing : some universal concerns (1997) 0.05
    0.05025608 = product of:
      0.10051216 = sum of:
        0.090069 = weight(_text_:interfaces in 399) [ClassicSimilarity], result of:
          0.090069 = score(doc=399,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.40299654 = fieldWeight in 399, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0546875 = fieldNorm(doc=399)
        0.010443161 = product of:
          0.031329483 = sum of:
            0.031329483 = weight(_text_:systems in 399) [ClassicSimilarity], result of:
              0.031329483 = score(doc=399,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 399, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=399)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Computer-based systems can produce enormous retrieval sets even when good search logic is used. Sometimes this is desirable; more often it is not. Appropriate filters can limit search results, but they represent only a partial solution. Simple ranking techniques are needed that are both effective and easily understood by the humans doing the searching. Optimal search output, whether from a traditional database or the Internet, will result when intuitive interfaces are designed that inspire confidence while making the necessary mathematics transparent. Weighted term searching using powers of 2, a technique proposed early in the history of information retrieval, can be simplified and used in combination with modern graphics and textual input to achieve these results.
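    The sketch below illustrates one common reading of power-of-2 term weighting (illustrative only; it is not taken from Davis's paper): query terms are weighted 1, 2, 4, ... so that a document's total weight both ranks it and uniquely identifies which combination of terms it matched.

      # Illustrative power-of-2 term weighting: term i carries weight 2**i, so the
      # total score doubles as a bit mask of the matched terms.
      def powers_of_two_score(query_terms, document_terms):
          doc = set(document_terms)
          return sum(2 ** i for i, term in enumerate(query_terms) if term in doc)

      docs = {
          "d1": ["retrieval", "ranking", "web"],
          "d2": ["retrieval", "browsing"],
          "d3": ["web", "browsing", "ranking"],
      }
      query = ["retrieval", "browsing", "ranking"]        # weights 1, 2, 4
      for doc_id in sorted(docs, key=lambda d: powers_of_two_score(query, docs[d]), reverse=True):
          print(doc_id, powers_of_two_score(query, docs[doc_id]))   # d3=6, d1=5, d2=3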
  4. Blandford, A.; Adams, A.; Attfield, S.; Buchanan, G.; Gow, J.; Makri, S.; Rimmer, J.; Warwick, C.: ¬The PRET A Rapporter framework : evaluating digital libraries from the perspective of information work (2008) 0.05
    0.04956404 = product of:
      0.09912808 = sum of:
        0.07720201 = weight(_text_:interfaces in 2021) [ClassicSimilarity], result of:
          0.07720201 = score(doc=2021,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.3454256 = fieldWeight in 2021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.046875 = fieldNorm(doc=2021)
        0.021926071 = product of:
          0.06577821 = sum of:
            0.06577821 = weight(_text_:systems in 2021) [ClassicSimilarity], result of:
              0.06577821 = score(doc=2021,freq=12.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.4990213 = fieldWeight in 2021, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2021)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    The strongest tradition of IR systems evaluation has focused on system effectiveness; more recently, there has been a growing interest in evaluation of Interactive IR systems, balancing system and user-oriented evaluation criteria. In this paper we shift the focus to considering how IR systems, and particularly digital libraries, can be evaluated to assess (and improve) their fit with users' broader work activities. Taking this focus, we answer a different set of evaluation questions that reveal more about the design of interfaces, user-system interactions and how systems may be deployed in the information working context. The planning and conduct of such evaluation studies share some features with the established methods for conducting IR evaluation studies, but come with a shift in emphasis; for example, a greater range of ethical considerations may be pertinent. We present the PRET A Rapporter framework for structuring user-centred evaluation studies and illustrate its application to three evaluation studies of digital library systems.
    Footnote
    Contribution to a thematic section: Evaluation of Interactive Information Retrieval Systems
  5. Meadows, C.J.: ¬A study of user performance and attitudes with information retrieval interfaces (1995) 0.04
    0.043157235 = product of:
      0.17262894 = sum of:
        0.17262894 = weight(_text_:interfaces in 2674) [ClassicSimilarity], result of:
          0.17262894 = score(doc=2674,freq=10.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.7723952 = fieldWeight in 2674, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.046875 = fieldNorm(doc=2674)
      0.25 = coord(1/4)
    
    Abstract
    Reports on a project undertaken to compare the behaviour of 2 types of users with 2 types of information retrieval interfaces. The user types were search process specialists and subject matter domain specialists with no prior online database search experience. The interfaces were native DIALOG, which uses a procedural language, and OAK, a largely menu-based, hence non-procedural, language interface communicating with DIALOG. 3 types of data were recorded: logs automatically recorded by computer monitoring of all searches, results of structured interviews with subjects at the time of the searches, and results of focus group discussions after all project tasks were completed. The type of user was determined by a combination of prior training, objective in searching, and subject domain knowledge. The results show that the type of interface does affect performance and that users adapt their behaviour differently to different interfaces. Different combinations of search experience and domain knowledge will lead to different behaviour in the use of an information retrieval system. Different kinds of users can best be served with different kinds of interfaces.
  6. Wildemuth, B.M.; Jacob, E.K.; Fullington, A.; Bliek, R. de; Friedman, C.P.: ¬A detailed analysis of end-user search behaviours (1991) 0.04
    0.035897203 = product of:
      0.071794406 = sum of:
        0.064335 = weight(_text_:interfaces in 2423) [ClassicSimilarity], result of:
          0.064335 = score(doc=2423,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.28785467 = fieldWeight in 2423, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2423)
        0.007459401 = product of:
          0.022378203 = sum of:
            0.022378203 = weight(_text_:systems in 2423) [ClassicSimilarity], result of:
              0.022378203 = score(doc=2423,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.1697705 = fieldWeight in 2423, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2423)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Each search statement in the revision of a search strategy can be viewed as a 'move' in the overall search. Very little is known about how end users develop and revise their search strategies. A study was conducted to analyse the moves made in 244 database searches conducted by 26 medical students at the University of North Carolina at Chapel Hill. Students searched INQUIRER, a database of facts and concepts in microbiology. The searches were conducted during a 3-week period in spring 1990 and were recorded by the INQUIRER system. Each search statement was categorised using Fidel's online searching moves (see Online review 9(1985) S.61-74) and Bates' search tactics (see JASIS 30(1979) S.205-214). Further analyses indicated that the most common moves were Browse/Specify, Select, Exhaust, Intersect, and Vary, and that the selection of moves varied by student and by problem. Analysis of search tactics (combinations of moves) identified 5 common search approaches. The results of this study have implications for future research on search behaviours, for the design of system interfaces and database structures, and for the training of end users.
    Source
    ASIS'91: systems understanding people. Proc. of the 54th Annual Meeting of the ASIS, vol.28, Washington, DC, 27.-31.10.1991. Ed.: J.-M. Griffiths
  7. King, D.W.; Bryant, E.C.: ¬The evaluation of information services and products (1971) 0.03
    0.0321675 = product of:
      0.12867 = sum of:
        0.12867 = weight(_text_:interfaces in 4157) [ClassicSimilarity], result of:
          0.12867 = score(doc=4157,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.57570934 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.25 = coord(1/4)
    
    Content
    Covers the evaluative and control aspects of: classification and indexing processes and languages; document screening processes; composition, reproduction, acquisition, storage, and presentation; user-system interfaces. Also contains brief and lucid primers on user surveys, statistics, sampling methods, and experimental design.
  8. Tillotson, J.: Is keyword searching the answer? (1995) 0.02
    0.02251725 = product of:
      0.090069 = sum of:
        0.090069 = weight(_text_:interfaces in 1857) [ClassicSimilarity], result of:
          0.090069 = score(doc=1857,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.40299654 = fieldWeight in 1857, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1857)
      0.25 = coord(1/4)
    
    Abstract
    Examines 3 aspects of keyword searching to see if defaulting to keyword searches might serve as a solution to the problems users find when performing subject searches in OPACs. Investigates whether keyword searching produces useful results; whether people who use keyword searches to find information on a subject report that they are satisfied with the results; and how keyword searching and controlled vocabulary searching are offered and explained in currently available OPAC interfaces. Concludes that both keyword and controlled vocabulary searching ought to be easily available in an OPAC, and that improvements need to be made in the explanation and help offered to subject searchers.
  9. Spink, A.; Goodrum, A.; Robins, D.: Search intermediary elicitations during mediated online searching (1995) 0.02
    0.02251725 = product of:
      0.090069 = sum of:
        0.090069 = weight(_text_:interfaces in 3872) [ClassicSimilarity], result of:
          0.090069 = score(doc=3872,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.40299654 = fieldWeight in 3872, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3872)
      0.25 = coord(1/4)
    
    Abstract
    Investigates search intermediary elicitations during mediated online searching. A study of 40 online reference interviews, involving 1,557 search intermediary elicitations, found 15 different types of search intermediary elicitations to users. The elicitation purposes included search terms and strategies, database selection, relevance of retrieved items, and users' knowledge and previous information seeking. Analysis of the patterns in the types and sequencing of elicitations showed significant strings of multiple elicitations regarding search terms and strategies and relevance judgements. Discusses the implications of the findings for training search intermediaries and for the design of interfaces eliciting information from end users.
  10. Lespinasse, K.: TREC: une conference pour l'evaluation des systemes de recherche d'information (1997) 0.02
    0.016187705 = product of:
      0.06475082 = sum of:
        0.06475082 = product of:
          0.09712622 = sum of:
            0.05063609 = weight(_text_:systems in 744) [ClassicSimilarity], result of:
              0.05063609 = score(doc=744,freq=4.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.38414678 = fieldWeight in 744, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=744)
            0.046490133 = weight(_text_:22 in 744) [ClassicSimilarity], result of:
              0.046490133 = score(doc=744,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.30952093 = fieldWeight in 744, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=744)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    TREC is an annual conference held in the USA devoted to electronic systems for large full-text information searching. The conference deals with evaluation and comparison techniques developed since 1992 by participants from the research and industrial fields. The work of the conference is intended for designers (rather than users) of systems which access full-text information. Describes the context, objectives, organization, evaluation methods and limits of TREC.
    Date
    1. 8.1996 22:01:00
  11. Ellis, D.: Progress and problems in information retrieval (1996) 0.02
    0.016187705 = product of:
      0.06475082 = sum of:
        0.06475082 = product of:
          0.09712622 = sum of:
            0.05063609 = weight(_text_:systems in 789) [ClassicSimilarity], result of:
              0.05063609 = score(doc=789,freq=4.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.38414678 = fieldWeight in 789, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=789)
            0.046490133 = weight(_text_:22 in 789) [ClassicSimilarity], result of:
              0.046490133 = score(doc=789,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.30952093 = fieldWeight in 789, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=789)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    An introduction to the principal generic approaches to information retrieval research with their associated concepts, models and systems, this text is designed to keep the information professional up to date with the major themes and developments that have preoccupied researchers in recent months in relation to textual and documentary retrieval systems.
    Date
    26. 7.2002 20:22:46
  12. Draper, S.W.; Dunlop, M.D.: New IR - new evaluation : the impact of interaction and multimedia on information retrieval and its evaluation (1997) 0.02
    0.01608375 = product of:
      0.064335 = sum of:
        0.064335 = weight(_text_:interfaces in 2462) [ClassicSimilarity], result of:
          0.064335 = score(doc=2462,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.28785467 = fieldWeight in 2462, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2462)
      0.25 = coord(1/4)
    
    Abstract
    The field of information retrieval (IR) traditionally addressed the problem of retrieving text documents from large collections by full-text indexing of words. It has always been characterised by a strong focus on evaluation to compare the performance of alternative designs. The emergence into widespread use both of multimedia and of interactive user interfaces has extensive implications for this field and for the evaluation methods on which it depends. Discusses what we currently understand about those implications. The 'system' being measured must be expanded to include the human users, whose behaviour has a large effect on overall retrieval success, which now depends upon sessions of many retrieval cycles rather than a single transaction. Multimedia raise issues not only of how users might specify a query in the same medium (e.g. sketch the kind of picture they want), but also of cross-medium retrieval. Current explorations in IR evaluation show diversity along at least 2 dimensions. One is that between comprehensive models that have a place for every possible relevant factor and lightweight methods. The other is that between highly standardised workbench tests that avoid human users and workplace studies.
  13. Rijsbergen, C.J. van: ¬A test for the separation of relevant and non-relevant documents in experimental retrieval collections (1973) 0.02
    0.015567046 = product of:
      0.062268183 = sum of:
        0.062268183 = product of:
          0.093402274 = sum of:
            0.04691214 = weight(_text_:29 in 5002) [ClassicSimilarity], result of:
              0.04691214 = score(doc=5002,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.31092256 = fieldWeight in 5002, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5002)
            0.046490133 = weight(_text_:22 in 5002) [ClassicSimilarity], result of:
              0.046490133 = score(doc=5002,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.30952093 = fieldWeight in 5002, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5002)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Date
    19. 3.1996 11:22:12
    Source
    Journal of documentation. 29(1973) no.3, S.251-257
  14. Ravana, S.D.; Taheri, M.S.; Rajagopal, P.: Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems (2015) 0.01
    0.014710582 = product of:
      0.058842327 = sum of:
        0.058842327 = product of:
          0.08826349 = sum of:
            0.059207156 = weight(_text_:systems in 2587) [ClassicSimilarity], result of:
              0.059207156 = score(doc=2587,freq=14.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.4491705 = fieldWeight in 2587, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2587)
            0.029056335 = weight(_text_:22 in 2587) [ClassicSimilarity], result of:
              0.029056335 = score(doc=2587,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.19345059 = fieldWeight in 2587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2587)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Purpose: The purpose of this paper is to propose a method for obtaining more accurate results when comparing the performance of paired information retrieval (IR) systems, with reference to the current method, which is based on the mean effectiveness scores of the systems across a set of identified topics/queries. Design/methodology/approach: Based on the proposed approach, instead of the classic method of using a set of topic scores, document-level scores are considered as the evaluation unit. These document scores are the defined document weights, which take the place of the systems' mean average precision (MAP) scores as the significance test's statistic. The experiments were conducted using the TREC 9 Web track collection. Findings: The p-values generated through the two types of significance tests, namely the Student's t-test and the Mann-Whitney test, show that by using document-level scores as the evaluation unit, the difference between IR systems is more significant compared with utilizing topic scores. Originality/value: Utilizing a suitable test collection is a primary prerequisite for the comparative evaluation of IR systems. However, in addition to reusable test collections, accurate statistical testing is a necessity for these evaluations. The findings of this study will assist IR researchers in evaluating their retrieval systems and algorithms more accurately.
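    For reference, the sketch below shows the conventional topic-level comparison the paper argues against: per-topic average precision scores of two systems compared with a paired t-test and a Mann-Whitney test (SciPy). The scores are invented for illustration, and the paper's document-level weighting scheme is not reproduced here.

      from scipy import stats

      # Conventional topic-level comparison of two IR systems: per-topic average
      # precision, paired t-test and Mann-Whitney U (illustrative scores only).
      system_a = [0.42, 0.31, 0.55, 0.28, 0.47, 0.39, 0.52, 0.33]
      system_b = [0.38, 0.35, 0.49, 0.30, 0.41, 0.36, 0.50, 0.29]

      t_stat, t_p = stats.ttest_rel(system_a, system_b)
      u_stat, u_p = stats.mannwhitneyu(system_a, system_b, alternative="two-sided")
      print(f"paired t-test p={t_p:.3f}, Mann-Whitney U p={u_p:.3f}")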
    Date
    20. 1.2015 18:30:22
  15. Sanderson, M.: ¬The Reuters test collection (1996) 0.01
    0.013715876 = product of:
      0.054863505 = sum of:
        0.054863505 = product of:
          0.082295254 = sum of:
            0.03580512 = weight(_text_:systems in 6971) [ClassicSimilarity], result of:
              0.03580512 = score(doc=6971,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2716328 = fieldWeight in 6971, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6971)
            0.046490133 = weight(_text_:22 in 6971) [ClassicSimilarity], result of:
              0.046490133 = score(doc=6971,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.30952093 = fieldWeight in 6971, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6971)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  16. Pal, S.; Mitra, M.; Kamps, J.: Evaluation effort, reliability and reusability in XML retrieval (2011) 0.01
    0.012302123 = product of:
      0.049208492 = sum of:
        0.049208492 = product of:
          0.07381274 = sum of:
            0.044756405 = weight(_text_:systems in 4197) [ClassicSimilarity], result of:
              0.044756405 = score(doc=4197,freq=8.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.339541 = fieldWeight in 4197, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4197)
            0.029056335 = weight(_text_:22 in 4197) [ClassicSimilarity], result of:
              0.029056335 = score(doc=4197,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.19345059 = fieldWeight in 4197, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4197)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    The Initiative for the Evaluation of XML retrieval (INEX) provides a TREC-like platform for evaluating content-oriented XML retrieval systems. Since 2007, INEX has been using a set of precision-recall based metrics for its ad hoc tasks. The authors investigate the reliability and robustness of these focused retrieval measures, and of the INEX pooling method. They explore four specific questions: How reliable are the metrics when assessments are incomplete, or when query sets are small? What is the minimum pool/query-set size that can be used to reliably evaluate systems? Can the INEX collections be used to fairly evaluate "new" systems that did not participate in the pooling process? And, for a fixed amount of assessment effort, would this effort be better spent in thoroughly judging a few queries, or in judging many queries relatively superficially? The authors' findings validate properties of precision-recall-based metrics observed in document retrieval settings. Early precision measures are found to be more error-prone and less stable under incomplete judgments and small topic-set sizes. They also find that system rankings remain largely unaffected even when assessment effort is substantially (but systematically) reduced, and confirm that the INEX collections remain usable when evaluating nonparticipating systems. Finally, they observe that for a fixed amount of effort, judging shallow pools for many queries is better than judging deep pools for a smaller set of queries. However, when judging only a random sample of a pool, it is better to completely judge fewer topics than to partially judge many topics. This result confirms the effectiveness of pooling methods.
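    The pooling method whose reliability the paper examines works, in outline, as sketched below (details differ across evaluation campaigns; runs and k here are illustrative, not INEX data): the documents judged for a topic are the union of the top-k results contributed by each participating run.

      # TREC/INEX-style pooling: judge only the union of each run's top-k results.
      def build_pool(runs, k):
          pool = set()
          for ranked_docs in runs.values():
              pool.update(ranked_docs[:k])
          return pool

      runs = {
          "run_A": ["d12", "d7", "d3", "d44"],
          "run_B": ["d7", "d19", "d3", "d2"],
          "run_C": ["d5", "d12", "d19", "d1"],
      }
      print(sorted(build_pool(runs, k=3)))    # union of the three runs' top 3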
    Date
    22. 1.2011 14:20:56
  17. Rajagopal, P.; Ravana, S.D.; Koh, Y.S.; Balakrishnan, V.: Evaluating the effectiveness of information retrieval systems using effort-based relevance judgment (2019) 0.01
    0.012302123 = product of:
      0.049208492 = sum of:
        0.049208492 = product of:
          0.07381274 = sum of:
            0.044756405 = weight(_text_:systems in 5287) [ClassicSimilarity], result of:
              0.044756405 = score(doc=5287,freq=8.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.339541 = fieldWeight in 5287, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5287)
            0.029056335 = weight(_text_:22 in 5287) [ClassicSimilarity], result of:
              0.029056335 = score(doc=5287,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.19345059 = fieldWeight in 5287, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5287)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Purpose: Effort, in addition to relevance, is a major factor in the satisfaction and utility of a document to the actual user. The purpose of this paper is to propose a method for generating relevance judgments that incorporate effort without involving human judges. The study then determines the variation in system rankings due to low-effort relevance judgments when evaluating retrieval systems at different depths of evaluation. Design/methodology/approach: Effort-based relevance judgments are generated using a proposed boxplot approach for simple document features, HTML features and readability features. The boxplot approach is a simple yet repeatable approach for classifying documents' effort while ensuring outlier scores do not skew the grading of the entire set of documents. Findings: The evaluation of retrieval systems using low-effort relevance judgments has a stronger influence at shallow depths of evaluation than at deeper depths. It is proved that the difference in system rankings is due to low-effort documents and not to the number of relevant documents. Originality/value: Hence, it is crucial to evaluate retrieval systems at shallow depths using low-effort relevance judgments.
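    One plausible reading of the boxplot approach is sketched below (illustrative only, not the paper's exact recipe): per-document effort scores are graded by quartiles, with the usual 1.5*IQR fences keeping outliers from skewing the grade boundaries.

      import statistics

      # Boxplot-style grading of per-document "effort" scores (e.g. length or a
      # readability measure): classify by quartiles, clamp outliers at the IQR fences.
      def boxplot_grade(scores):
          q1, _, q3 = statistics.quantiles(scores, n=4)
          low_fence, high_fence = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
          grades = []
          for s in scores:
              s = min(max(s, low_fence), high_fence)       # outliers cannot skew the grade
              grades.append("low" if s <= q1 else "medium" if s <= q3 else "high")
          return grades

      effort = [120, 140, 150, 155, 160, 170, 900]         # 900 is an outlier
      print(boxplot_grade(effort))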
    Date
    20. 1.2015 18:30:22
  18. Smithson, S.: Information retrieval evaluation in practice : a case study approach (1994) 0.01
    0.0120013915 = product of:
      0.048005566 = sum of:
        0.048005566 = product of:
          0.07200835 = sum of:
            0.031329483 = weight(_text_:systems in 7302) [ClassicSimilarity], result of:
              0.031329483 = score(doc=7302,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 7302, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7302)
            0.040678866 = weight(_text_:22 in 7302) [ClassicSimilarity], result of:
              0.040678866 = score(doc=7302,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2708308 = fieldWeight in 7302, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7302)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    The evaluation of information retrieval systems is an important yet difficult operation. This paper describes an exploratory evaluation study that takes an interpretive approach to evaluation. The longitudinal study examines evaluation through the information-seeking behaviour of 22 case studies of 'real' users. The eclectic approach to data collection produced behavioural data that are compared with relevance judgements and satisfaction ratings. The study demonstrates considerable variations among the cases, among different evaluation measures within the same case, and among the same measures at different stages within a single case. It is argued that those involved in evaluation should be aware of the difficulties, and base any evaluation on a good understanding of the cases in question
  19. Blair, D.C.: STAIRS Redux : thoughts on the STAIRS evaluation, ten years after (1996) 0.01
    0.0120013915 = product of:
      0.048005566 = sum of:
        0.048005566 = product of:
          0.07200835 = sum of:
            0.031329483 = weight(_text_:systems in 3002) [ClassicSimilarity], result of:
              0.031329483 = score(doc=3002,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 3002, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3002)
            0.040678866 = weight(_text_:22 in 3002) [ClassicSimilarity], result of:
              0.040678866 = score(doc=3002,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2708308 = fieldWeight in 3002, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3002)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    The test of retrieval effectiveness performed on IBM's STAIRS, reported in 'Communications of the ACM' 10 years ago, continues to be cited frequently in the information retrieval literature. The reasons for the study's continuing pertinence to today's research are discussed, and the political, legal, and commercial aspects of the study are presented. In addition, the method of calculating recall that was used in the STAIRS study is discussed in some detail, especially how it reduces the 5 major types of uncertainty in recall estimations. It is also suggested that this method of recall estimation may serve as the basis for recall estimations that might be truly comparable between systems.
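    As background, the sketch below gives a generic illustration of sample-based recall estimation (it is not claimed to be the STAIRS procedure itself): the relevant documents missed by a search are estimated from judgments on a random sample of the unretrieved documents.

      import random

      # Generic sample-based recall estimation: judge a random sample of the
      # unretrieved documents, extrapolate the number of missed relevant documents,
      # and estimate recall as found / (found + estimated missed).
      def estimate_recall(relevant_found, unretrieved_ids, judge, sample_size, seed=0):
          sample = random.Random(seed).sample(unretrieved_ids, sample_size)
          hit_rate = sum(judge(d) for d in sample) / len(sample)
          estimated_missed = hit_rate * len(unretrieved_ids)
          return relevant_found / (relevant_found + estimated_missed)

      # Toy usage: 40 relevant documents retrieved; 10,000 unretrieved documents,
      # of which roughly 1% are relevant according to the (hypothetical) judge.
      unretrieved = list(range(10_000))
      print(round(estimate_recall(40, unretrieved, lambda d: d % 100 == 0, 200), 3))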
    Source
    Journal of the American Society for Information Science. 47(1996) no.1, S.4-22
  20. Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.01
    0.0120013915 = product of:
      0.048005566 = sum of:
        0.048005566 = product of:
          0.07200835 = sum of:
            0.031329483 = weight(_text_:systems in 3368) [ClassicSimilarity], result of:
              0.031329483 = score(doc=3368,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 3368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3368)
            0.040678866 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
              0.040678866 = score(doc=3368,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2708308 = fieldWeight in 3368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3368)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    The performance of an information retrieval or text and media filtering system may be determined through analytic methods as well as by traditional simulation or experimental methods. These analytic methods can provide precise statements about expected performance. They can thus determine which of 2 similarly performing systems is superior. For both a single query term and a multiple query term retrieval model, a model for comparing the performance of different probabilistic retrieval methods is developed. This method may be used in computing the average search length for a query, given only knowledge of database parameter values. Describes predictive models for inverse document frequency, binary independence, and relevance feedback based retrieval and filtering. Simulations illustrate how the single term model performs, and sample performance predictions are given for single term and multiple term problems.
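    For orientation, the measure being predicted, the average search length (ASL), is the mean rank at which relevant documents are encountered in a ranking. The sketch below only computes it from an observed ranking; Losee's contribution is predicting it analytically from database parameters, which is not reproduced here.

      # Average search length (ASL): mean rank position of the relevant documents in a ranking.
      def average_search_length(ranking, relevant):
          positions = [rank for rank, doc in enumerate(ranking, start=1) if doc in relevant]
          return sum(positions) / len(positions) if positions else float("inf")

      ranking = ["d4", "d9", "d1", "d7", "d2", "d5"]
      relevant = {"d1", "d5"}
      print(average_search_length(ranking, relevant))      # (3 + 6) / 2 = 4.5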
    Date
    22. 2.1996 13:14:10

Types

  • a 180
  • s 9
  • m 8
  • el 4
  • r 3
  • p 1