Search (397 results, page 19 of 20)

  • theme_ss:"Retrievalstudien"
  1. Lioma, C.; Ounis, I.: A syntactically-based query reformulation technique for information retrieval (2008) 0.00
    2.401893E-4 = product of:
      0.004083218 = sum of:
        0.004083218 = weight(_text_:in in 2031) [ClassicSimilarity], result of:
          0.004083218 = score(doc=2031,freq=8.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.120230645 = fieldWeight in 2031, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=2031)
      0.05882353 = coord(1/17)
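
    The explain tree above is standard Lucene ClassicSimilarity (TF-IDF) output. As a sanity check, the score can be recomputed from the constants shown; a minimal Python sketch (ours, not part of the result page; only the idf formula is taken from Lucene's documented ClassicSimilarity):

      import math

      freq = 8.0                                   # termFreq of "in" in doc 2031
      idf = 1.3602545                              # 1 + ln(maxDocs / (docFreq + 1)) = 1 + ln(44218 / 30842)
      query_norm = 0.024967048
      field_norm = 0.03125                         # encoded length norm for this field
      coord = 1 / 17                               # 1 of 17 query clauses matched

      tf = math.sqrt(freq)                         # 2.828427 = tf(freq=8.0)
      query_weight = idf * query_norm              # 0.033961542
      field_weight = tf * idf * field_norm         # 0.120230645
      score = coord * query_weight * field_weight
      print(f"{score:.6E}")                        # ~2.401893E-04
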
    
    Abstract
    Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society for Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computing Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We apply this finding to Information Retrieval as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high-frequency shallow syntactic fragments correspond to content-bearing lexical fragments.
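
    The mechanism the abstract describes, stripped to its core, is: tag a query and keep only the fragments whose part-of-speech blocks occur frequently in a language sample. A toy sketch under stated assumptions (NLTK's tokenizer and Penn Treebank tagger are our choices; the paper's actual induction procedure is more involved):

      # Requires: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
      from collections import Counter
      import nltk

      def pos_block_counts(sentences, block_size=2):
          """Count part-of-speech n-grams ("blocks") in a language sample."""
          counts = Counter()
          for sent in sentences:
              tags = [t for _, t in nltk.pos_tag(nltk.word_tokenize(sent))]
              for i in range(len(tags) - block_size + 1):
                  counts[tuple(tags[i:i + block_size])] += 1
          return counts

      def reformulate(query, block_counts, block_size=2, min_freq=5):
          """Drop query words not covered by any frequent POS block."""
          tokens = nltk.word_tokenize(query)
          tags = [t for _, t in nltk.pos_tag(tokens)]
          keep = set()
          for i in range(len(tags) - block_size + 1):
              if block_counts[tuple(tags[i:i + block_size])] >= min_freq:
                  keep.update(range(i, i + block_size))
          return " ".join(tokens[i] for i in sorted(keep)) or query
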
  2. Tague-Sutcliffe, J.: Information retrieval experimentation (2009) 0.00
    2.401893E-4 = product of:
      0.004083218 = sum of:
        0.004083218 = weight(_text_:in in 3801) [ClassicSimilarity], result of:
          0.004083218 = score(doc=3801,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.120230645 = fieldWeight in 3801, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=3801)
      0.05882353 = coord(1/17)
    
    Abstract
    Jean Tague-Sutcliffe was an important figure in information retrieval experimentation. Here, she reviews the history of IR research, and provides a description of the fundamental paradigm of information retrieval experimentation that continues to dominate the field.
  3. Carterette, B.: Test collections (2009) 0.00
    2.401893E-4 = product of:
      0.004083218 = sum of:
        0.004083218 = weight(_text_:in in 3891) [ClassicSimilarity], result of:
          0.004083218 = score(doc=3891,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.120230645 = fieldWeight in 3891, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=3891)
      0.05882353 = coord(1/17)
    
    Abstract
    Research and development of search engines and other information retrieval (IR) systems proceeds by a cycle of design, implementation, and experimentation, with the results of each experiment influencing design decisions in the next iteration of the cycle. Batch experiments on test collections help ensure that this process goes as smoothly and as quickly as possible. A test collection comprises a collection of documents, a set of information needs, and judgments of the relevance of documents to those needs.
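
    The three-part definition in the last sentence maps directly onto data structures. A minimal sketch with invented identifiers (real collections such as TREC's have the same three parts):

      docs = {"d1": "...", "d2": "...", "d3": "..."}                # document collection
      topics = {"t1": "effects of caffeine on memory"}             # information needs
      qrels = {("t1", "d1"): 1, ("t1", "d2"): 0, ("t1", "d3"): 1}  # judgments, 1 = relevant

      def precision_at_k(topic_id, ranking, k):
          """Fraction of the top-k retrieved documents judged relevant."""
          return sum(qrels.get((topic_id, d), 0) for d in ranking[:k]) / k

      print(precision_at_k("t1", ["d3", "d2", "d1"], 2))  # 0.5
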
  4. Kutlu, M.; Elsayed, T.; Lease, M.: Intelligent topic selection for low-cost information retrieval evaluation : a new perspective on deep vs. shallow judging (2018) 0.00
    2.401893E-4 = product of:
      0.004083218 = sum of:
        0.004083218 = weight(_text_:in in 5092) [ClassicSimilarity], result of:
          0.004083218 = score(doc=5092,freq=8.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.120230645 = fieldWeight in 5092, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=5092)
      0.05882353 = coord(1/17)
    
    Abstract
    While test collections provide the cornerstone for Cranfield-based evaluation of information retrieval (IR) systems, it has become practically infeasible to rely on traditional pooling techniques to construct test collections at the scale of today's massive document collections (e.g., ClueWeb12's 700M+ Webpages). This has motivated a flurry of studies proposing more cost-effective yet reliable IR evaluation methods. In this paper, we propose a new intelligent topic selection method which reduces the number of search topics (and thereby costly human relevance judgments) needed for reliable IR evaluation. To rigorously assess our method, we integrate previously disparate lines of research on intelligent topic selection and deep vs. shallow judging (i.e., whether it is more cost-effective to collect many relevance judgments for a few topics or a few judgments for many topics). While prior work on intelligent topic selection has never been evaluated against shallow judging baselines, prior work on deep vs. shallow judging has largely argued for shallow judging, but assuming random topic selection. We argue that for evaluating any topic selection method, ultimately one must ask whether it is actually useful to select topics, or whether one should simply perform shallow judging over many topics. In seeking a rigorous answer to this over-arching question, we conduct a comprehensive investigation over a set of relevant factors never previously studied together: 1) method of topic selection; 2) the effect of topic familiarity on human judging speed; and 3) how different topic generation processes (requiring varying human effort) impact (i) budget utilization and (ii) the resultant quality of judgments. Experiments on NIST TREC Robust 2003 and Robust 2004 test collections show that not only can we reliably evaluate IR systems with fewer topics, but also that: 1) when topics are intelligently selected, deep judging is often more cost-effective than shallow judging in evaluation reliability; and 2) topic familiarity and topic generation costs greatly impact the evaluation cost vs. reliability trade-off. Our findings challenge conventional wisdom in showing that deep judging is often preferable to shallow judging when topics are selected intelligently.
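
    The deep-vs.-shallow trade-off the abstract turns on is a fixed-budget calculation. A toy sketch with illustrative numbers only (none are from the paper):

      budget = 2000                      # total relevance judgments affordable

      for depth in (10, 50, 100):        # judgments collected per topic
          n_topics = budget // depth     # topics one can afford at that depth
          print(f"depth {depth:>3}: {n_topics:>3} topics judged")
      # depth  10: 200 topics judged   (shallow judging, many topics)
      # depth  50:  40 topics judged
      # depth 100:  20 topics judged   (deep judging, few topics)
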
  5. Balog, K.; Schuth, A.; Dekker, P.; Tavakolpoursaleh, N.; Schaer, P.; Chuang, P.-Y.: Overview of the TREC 2016 Open Search track Academic Search Edition (2016) 0.00
    2.401893E-4 = product of:
      0.004083218 = sum of:
        0.004083218 = weight(_text_:in in 43) [ClassicSimilarity], result of:
          0.004083218 = score(doc=43,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.120230645 = fieldWeight in 43, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=43)
      0.05882353 = coord(1/17)
    
    Abstract
    We present the TREC Open Search track, which represents a new evaluation paradigm for information retrieval. It offers researchers the possibility to evaluate their approaches in a live setting, with real, unsuspecting users of an existing search engine. The first edition of the track focuses on the academic search domain and features the ad-hoc scientific literature search task. We report on experiments with three different academic search engines: CiteSeerX, SSOAR, and Microsoft Academic Search.
  6. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 4152) [ClassicSimilarity], result of:
          0.003609089 = score(doc=4152,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
      0.05882353 = coord(1/17)
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining the relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary, and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the uniqueness words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination values were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid) methods. An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.
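
    The comparison procedure described - cutting the top 20, 40, 60 and 75% of each method's term ranking and checking how far the sets coincide - can be sketched as follows; weights, terms and method names are hypothetical placeholders, not data from the paper:

      def top_fraction(weights, fraction):
          """Return the set of terms in the top `fraction` of the ranking."""
          ranked = sorted(weights, key=weights.get, reverse=True)
          return set(ranked[: max(1, int(len(ranked) * fraction))])

      occurrence = {"laser": 9.1, "thermal": 7.4, "alloy": 6.8, "study": 1.2}
      signal     = {"laser": 8.7, "thermal": 7.9, "alloy": 6.1, "study": 0.9}

      for frac in (0.2, 0.4, 0.6, 0.75):
          a, b = top_fraction(occurrence, frac), top_fraction(signal, frac)
          print(f"top {frac:.0%}: overlap {len(a & b) / len(a | b):.2f}")  # Jaccard
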
  7. MacCall, S.L.; Cleveland, A.D.; Gibson, I.E.: Outline and preliminary evaluation of the classical digital library model (1999) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 6541) [ClassicSimilarity], result of:
          0.003609089 = score(doc=6541,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 6541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6541)
      0.05882353 = coord(1/17)
    
    Abstract
    The growing number of networked information resources and services offers unprecedented opportunities for delivering high quality information to the computer desktop of a wide range of individuals. However, there is currently a reliance on a database retrieval model, in which end users use keywords to search large collections of automatically indexed resources in order to find needed information. As an alternative to the database retrieval model, this paper outlines the classical digital library model, which is derived from traditional practices of library and information science professionals. These practices include the selection and organization of information resources for local populations of users and the integration of advanced information retrieval tools, such as databases and the Internet, into these collections. To evaluate this model, library and information professionals and end users involved with primary care medicine were asked to respond to a series of questions comparing their experiences with a digital library developed for the primary care population to their experiences with general Internet use. Preliminary results are reported.
  8. Draper, S.W.; Dunlop, M.D.: New IR - new evaluation : the impact of interaction and multimedia on information retrieval and its evaluation (1997) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 2462) [ClassicSimilarity], result of:
          0.003609089 = score(doc=2462,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 2462, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2462)
      0.05882353 = coord(1/17)
    
    Abstract
    The field of information retrieval (IR) traditionally addressed the problem of retrieving text documents from large collections by full-text indexing of words. It has always been characterised by a strong focus on evaluation to compare the performance of alternative designs. The emergence into widespread use of both multimedia and interactive user interfaces has extensive implications for this field and the evaluation methods on which it depends. Discusses what we currently understand about those implications. The 'system' being measured must be expanded to include the human users, whose behaviour has a large effect on overall retrieval success, which now depends upon sessions of many retrieval cycles rather than a single transaction. Multimedia raise issues not only of how users might specify a query in the same medium (e.g. sketch the kind of picture they want), but of cross-medium retrieval. Current explorations in IR evaluation show diversity along at least 2 dimensions. One is that between comprehensive models that have a place for every possible relevant factor, and lightweight methods. The other is that between highly standardised workbench tests avoiding human users vs. workplace studies.
  9. Borlund, P.: Experimental components for the evaluation of interactive information retrieval systems (2000) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 4549) [ClassicSimilarity], result of:
          0.003609089 = score(doc=4549,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 4549, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4549)
      0.05882353 = coord(1/17)
    
    Abstract
    This paper presents a set of basic components which constitutes the experimental setting intended for the evaluation of interactive information retrieval (IIR) systems, the aim of which is to facilitate evaluation of IIR systems in a way which is as close as possible to realistic IR processes. The experimental setting consists of 3 components: (1) the involvement of potential users as test persons; (2) the application of dynamic and individual information needs; and (3) the use of multidimensional and dynamic relevance judgements. Hidden under the information need component is the essential central sub-component, the simulated work task situation, the tool that triggers the (simulated) dynamic information need. This paper also reports on the empirical findings of the meta-evaluation of the application of this sub-component, the purpose of which is to discover whether the application of simulated work task situations to future evaluation of IIR systems can be recommended. Investigations are carried out to determine whether any search behavioural differences exist between test persons' treatment of their own real information needs versus simulated information needs. The hypothesis is that if no difference exists, one can correctly substitute real information needs with simulated information needs through the application of simulated work task situations. The empirical results of the meta-evaluation provide positive evidence for the application of simulated work task situations to the evaluation of IIR systems. The results also indicate that tailoring work task situations to the group of test persons is important in motivating them. Furthermore, the results of the evaluation show that different versions of semantic openness of the simulated situations make no difference to the test persons' search treatment.
  10. Sun, Y.; Kantor, P.B.: Cross-evaluation : a new model for information system evaluation (2006) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 5048) [ClassicSimilarity], result of:
          0.003609089 = score(doc=5048,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 5048, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5048)
      0.05882353 = coord(1/17)
    
    Abstract
    In this article, we introduce a new information system evaluation method and report on its application to a collaborative information seeking system, AntWorld. The key innovation of the new method is to use precisely the same group of users who work with the system as judges, a system we call Cross-Evaluation. In the new method, we also propose to assess the system at the level of task completion. The obvious potential limitation of this method is that individuals may be inclined to think more highly of the materials that they themselves have found and are almost certain to think more highly of their own work product than they do of the products built by others. The keys to neutralizing this problem are careful design and a corresponding analytical model based on analysis of variance. We model the several measures of task completion with a linear model of five effects, describing the users who interact with the system, the system used to finish the task, the task itself, the behavior of individuals as judges, and the self-judgment bias. Our analytical method successfully isolates the effect of each variable. This approach provides a successful model to make concrete the "three realities" paradigm, which calls for "real tasks," "real users," and "real systems."
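
    One plausible way to write out the five-effect linear model, with notation of our own invention (the article's symbols are not reproduced here):

      \[
        y_{ustj} = \mu + \alpha_u + \beta_s + \gamma_t + \delta_j
                 + \lambda\,\mathbf{1}[u = j] + \varepsilon_{ustj}
      \]

    Here y_{ustj} is a task-completion measure when user u, on system s, performs task t and is judged by judge j; \alpha_u, \beta_s, \gamma_t and \delta_j are the user, system, task and judge effects; \lambda is the self-judgment bias, active only when the judge is the user; and \varepsilon_{ustj} is residual error.
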
  11. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Order-based fitness functions for genetic algorithms applied to relevance feedback (2003) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 5154) [ClassicSimilarity], result of:
          0.003609089 = score(doc=5154,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 5154, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5154)
      0.05882353 = coord(1/17)
    
    Abstract
    López-Pujalte and Guerrero-Bote test a relevance feedback genetic algorithm while varying its order-based fitness functions, generating a function based upon the Ide dec-hi method as a baseline. Using the non-zero weighted term types assigned to the query, and to the initially retrieved set of documents, as genes, a chromosome of equal length is created for each. The algorithm is provided with the chromosomes for judged relevant documents, for judged irrelevant documents, and for the irrelevant documents with their terms negated. The algorithm uses random selection of all possible genes, but gives greater likelihood to those with higher fitness values. When the fittest chromosome of a previous population is eliminated, it is restored while the least fit of the new population is eliminated in its stead. A crossover probability of .8 and a mutation probability of .2 were used with 20 generations. Three fitness functions were utilized: the Horng and Yeh function, which takes into account the position of relevant documents, and two new functions, one based on accumulating the cosine similarity for retrieved documents, the other on stored fixed-recall-interval precisions. The Cranfield collection was used with the first 15 documents retrieved from 33 queries chosen to have at least 3 relevant documents in the first 15 and at least 5 relevant documents not initially retrieved. Precision was calculated at fixed recall levels using the residual collection method, which removes viewed documents. One of the three functions improved the original retrieval by 127 percent, while the Ide dec-hi method provided a 120 percent improvement.
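
    A compact sketch of such a genetic algorithm, using the parameters quoted above (crossover .8, mutation .2, 20 generations) and a simplified cosine-accumulation fitness; the selection and elitism schemes here are approximations of our own, not the authors' exact procedure:

      import random

      def cosine(q, d):
          num = sum(qi * di for qi, di in zip(q, d))
          den = (sum(qi * qi for qi in q) * sum(di * di for di in d)) ** 0.5
          return num / den if den else 0.0

      def fitness(chrom, docs, relevant):
          """Accumulate cosine similarity of the query chromosome to relevant docs."""
          return sum(cosine(chrom, docs[d]) for d in relevant)

      def evolve(docs, relevant, n_terms, pop_size=20, generations=20):
          pop = [[random.random() for _ in range(n_terms)] for _ in range(pop_size)]
          for _ in range(generations):
              scored = sorted(pop, key=lambda c: fitness(c, docs, relevant), reverse=True)
              nxt = [scored[0]]                             # elitism: keep the fittest
              while len(nxt) < pop_size:
                  a, b = random.choices(scored[:pop_size // 2], k=2)  # favor the fitter half
                  cut = random.randrange(n_terms)
                  child = a[:cut] + b[cut:] if random.random() < 0.8 else a[:]
                  if random.random() < 0.2:                 # mutate one gene
                      child[random.randrange(n_terms)] = random.random()
                  nxt.append(child)
              pop = nxt
          return max(pop, key=lambda c: fitness(c, docs, relevant))
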
  12. Díaz, A.; García, A.; Gervás, P.: User-centred versus system-centred evaluation of a personalization system (2008) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 2094) [ClassicSimilarity], result of:
          0.003609089 = score(doc=2094,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 2094, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2094)
      0.05882353 = coord(1/17)
    
    Abstract
    Some of the most popular measures to evaluate information filtering systems are usually independent of the users because they are based on relevance judgments obtained from experts. On the other hand, user-centred evaluation allows showing the different impressions that the users have formed of the running system. This work focuses on discussing the problem of user-centred versus system-centred evaluation of a Web content personalization system where the personalization is based on a user model that stores long-term (sections, categories and keywords) and short-term interests (adapted from user-provided feedback). The user-centred evaluation is based on questionnaires filled in by the users before and after using the system, and the system-centred evaluation is based on the comparison between rankings of documents, obtained from the application of a multi-tier selection process, and binary relevance judgments collected previously from real users. The user-centred and system-centred evaluations performed with 106 users during 14 working days have provided valuable data concerning the behaviour of the users with respect to issues such as document relevance or the relative importance attributed to different ways of personalization. The results obtained show general satisfaction with both the personalization processes (selection, adaptation and presentation) and the system as a whole.
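
    The system-centred half of this design reduces to scoring a produced ranking against binary judgments. One common measure for that is average precision; a minimal sketch with invented document IDs (the article does not state which rank-based measure was used):

      def average_precision(ranking, relevant):
          """Mean of precision values at each rank where a relevant doc appears."""
          hits, total = 0, 0.0
          for i, doc in enumerate(ranking, start=1):
              if doc in relevant:
                  hits += 1
                  total += hits / i
          return total / len(relevant) if relevant else 0.0

      print(average_precision(["d4", "d1", "d9", "d2"], {"d1", "d2"}))  # 0.5
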
  13. Behnert, C.; Lewandowski, D.: A framework for designing retrieval effectiveness studies of library information systems using human relevance assessments (2017) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 3700) [ClassicSimilarity], result of:
          0.003609089 = score(doc=3700,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 3700, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3700)
      0.05882353 = coord(1/17)
    
    Abstract
    Purpose - This paper demonstrates how to apply traditional information retrieval evaluation methods based on standards from the Text REtrieval Conference (TREC) and web search evaluation to all types of modern library information systems, including online public access catalogs, discovery systems, and digital libraries that provide web search features to gather information from heterogeneous sources.
    Design/methodology/approach - We apply conventional procedures from information retrieval evaluation to the library information system context, considering the specific characteristics of modern library materials.
    Findings - We introduce a framework consisting of five parts: (1) search queries, (2) search results, (3) assessors, (4) testing, and (5) data analysis. We show how to deal with comparability problems resulting from diverse document types, e.g., electronic articles vs. printed monographs, and what issues need to be considered for retrieval tests in the library context.
    Practical implications - The framework can be used as a guideline for conducting retrieval effectiveness studies in the library context.
    Originality/value - Although a considerable amount of research has been done on information retrieval evaluation, and standards for conducting retrieval effectiveness studies do exist, to our knowledge this is the first attempt to provide a systematic framework for evaluating the retrieval effectiveness of twenty-first-century library information systems. We demonstrate which issues must be considered and what decisions must be made by researchers prior to a retrieval test.
  14. Hider, P.: The search value added by professional indexing to a bibliographic database (2018) 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 4300) [ClassicSimilarity], result of:
          0.003609089 = score(doc=4300,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 4300, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4300)
      0.05882353 = coord(1/17)
    
    Abstract
    Gross et al. (2015) have demonstrated that about a quarter of hits would typically be lost to keyword searchers if contemporary academic library catalogs dropped their controlled subject headings. This article reports on an investigation of the search value that subject descriptors and identifiers assigned by professional indexers add to a bibliographic database, namely the Australian Education Index (AEI). First, a similar methodology to that developed by Gross et al. (2015) was applied, with keyword searches representing a range of educational topics run on the AEI database with and without its subject indexing. The results indicated that AEI users would also lose, on average, about a quarter of hits per query. Second, an alternative research design was applied in which an experienced literature searcher was asked to find resources on a set of educational topics on an AEI database stripped of its subject indexing and then asked to search for additional resources on the same topics after the subject indexing had been reinserted. In this study, the proportion of additional resources that would have been lost had it not been for the subject indexing was again found to be about a quarter of the total resources found for each topic, on average.
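
    The core measurement shared by Gross et al. (2015) and this study is a set difference between result lists. A minimal sketch with invented hits:

      with_indexing = {"r1", "r2", "r3", "r4"}   # hits on the full AEI-style database
      without_indexing = {"r1", "r3", "r4"}      # hits with subject indexing stripped

      lost = with_indexing - without_indexing    # hits owed to indexing alone
      print(len(lost) / len(with_indexing))      # 0.25 -> "about a quarter"
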
  15. Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.00
    2.1229935E-4 = product of:
      0.003609089 = sum of:
        0.003609089 = weight(_text_:in in 4309) [ClassicSimilarity], result of:
          0.003609089 = score(doc=4309,freq=4.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10626988 = fieldWeight in 4309, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4309)
      0.05882353 = coord(1/17)
    
    Abstract
    Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
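
    The filtering step this enables can be sketched as a simple threshold on the document-level quality score; the scores, subjects and threshold below are invented, and the paper's multi-layered regression architecture is not reproduced:

      records = [
          ("doc1", ["economics", "trade"], 0.91),  # (id, auto subjects, quality score)
          ("doc2", ["law"],                0.42),
          ("doc3", ["monetary policy"],    0.78),
      ]

      THRESHOLD = 0.7   # tuned on held-out data to uphold a precision constraint

      accepted = [(d, s) for d, s, q in records if q >= THRESHOLD]   # enter the index
      deferred = [d for d, _, q in records if q < THRESHOLD]         # e.g. route to humans
      print(accepted, deferred)
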
  16. Van der Walt, H.E.A.; Brakel, P.A. van: Method for the evaluation of the retrieval effectiveness of a CD-ROM bibliographic database (1991) 0.00
    2.1016564E-4 = product of:
      0.0035728158 = sum of:
        0.0035728158 = weight(_text_:in in 3114) [ClassicSimilarity], result of:
          0.0035728158 = score(doc=3114,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10520181 = fieldWeight in 3114, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3114)
      0.05882353 = coord(1/17)
    
    Abstract
    Addresses the problem of how potential users of CD-ROM databases can objectively establish which version of the same database is best suited for a specific situation. The problem was solved by applying the retrieval effectiveness of current online database search systems as a standard measurement. 5 search queries from the medical sciences were presented by experienced users of MEDLINE. Search strategies were written for both DIALOG and DATA-STAR. Search results were compared to create a recall base from documents present in both online searches. This recall base was then used to establish the recall and precision of 4 CD-ROM databases: MEDLINE, Compact Cambridge MEDLINE, DIALOG OnDisc, and Comprehensive MEDLINE/EBSCO.
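
    On one reading of the method (documents found by both online searches form the benchmark), the recall-base computation looks like this; document IDs are invented:

      dialog = {"m1", "m2", "m3", "m4"}        # hits from the DIALOG search
      datastar = {"m2", "m3", "m4", "m5"}      # hits from the DATA-STAR search
      recall_base = dialog & datastar          # present in both online searches

      cdrom_hits = {"m2", "m3", "m6"}          # one CD-ROM version's results
      recall = len(cdrom_hits & recall_base) / len(recall_base)
      precision = len(cdrom_hits & recall_base) / len(cdrom_hits)
      print(f"recall {recall:.2f}, precision {precision:.2f}")  # 0.67, 0.67
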
  17. Allen, B.: Logical reasoning and retrieval performance (1993) 0.00
    2.1016564E-4 = product of:
      0.0035728158 = sum of:
        0.0035728158 = weight(_text_:in in 5093) [ClassicSimilarity], result of:
          0.0035728158 = score(doc=5093,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10520181 = fieldWeight in 5093, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5093)
      0.05882353 = coord(1/17)
    
    Abstract
    Tests the logical reasoning ability of end users of a CD-ROM index and assesses associations between different levels of this ability and aspects of retrieval performance. Users' selection of vocabulary and their selection of citations for further examination are both influenced by this ability. The designs of information systems should address the effects of logical reasoning on search behaviour. People with lower levels of logical reasoning ability may experience difficulty using systems in which user selectivity plays an important role. Other systems, such as those with ranked output, may decrease the need for users to make selections and would be easier to use for people with lower levels of logical reasoning ability
  18. Large, A.; Beheshti, J.; Breuleux, A.: A comparison of information retrieval from print and CD-ROM versions of an encyclopedia by elementary school students (1994) 0.00
    2.1016564E-4 = product of:
      0.0035728158 = sum of:
        0.0035728158 = weight(_text_:in in 1015) [ClassicSimilarity], result of:
          0.0035728158 = score(doc=1015,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10520181 = fieldWeight in 1015, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1015)
      0.05882353 = coord(1/17)
    
    Abstract
    Describes an experiment using 48 sixth-grade students to compare retrieval techniques using the print and CD-ROM versions of Compton's Encyclopedia. Four queries of different complexity (measured by the number of terms present) were searched by the students after a short training session. The searches were timed and the retrieval steps and search terms were noted. The searches were no faster on the CD-ROM than the print version, but in both cases time was related directly to the number of terms involved. The students coped well with the CD-ROM interface and its several retrieval paths.
  19. Salton, G.: Thoughts about modern retrieval technologies (1988) 0.00
    2.1016564E-4 = product of:
      0.0035728158 = sum of:
        0.0035728158 = weight(_text_:in in 1522) [ClassicSimilarity], result of:
          0.0035728158 = score(doc=1522,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10520181 = fieldWeight in 1522, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1522)
      0.05882353 = coord(1/17)
    
    Abstract
    Paper presented at the 30th Annual Conference of the National Federation of Abstracting and Information Services, Philadelphia, 28 Feb-2 Mar 88. In recent years, alongside the growth in the amount and variety of available machine-readable data, new technologies have been introduced, such as high-density storage devices and fancy graphic displays useful for information transformation and access. New approaches have also been considered for processing the stored data, based on the construction of knowledge bases representing the contents and structure of the information, and the use of expert system techniques to control the user-system interactions. Provides a brief evaluation of the new information processing technologies, and of the software methods proposed for information manipulation.
  20. Spink, A.; Goodrum, A.: ¬A study of search intermediary working notes : implications for IR system design (1996) 0.00
    2.1016564E-4 = product of:
      0.0035728158 = sum of:
        0.0035728158 = weight(_text_:in in 6981) [ClassicSimilarity], result of:
          0.0035728158 = score(doc=6981,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.10520181 = fieldWeight in 6981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6981)
      0.05882353 = coord(1/17)
    
    Abstract
    Reports findings from an exploratory study investigating working notes created during encoding and external storage (EES) processes by human search intermediaries using a Boolean information retrieval system. Analysis of 221 sets of working notes created by human search intermediaries revealed extensive use of EES processes and the creation of working notes of textual, numerical and graphical entities. Nearly 70% of recorded working notes were textual/numerical entities, nearly 30% were graphical entities and 0.73% were indiscernible. Segmentation devices were also used in 48% of the working notes. The creation of working notes during the EES processes was a fundamental element within the mediated, interactive information retrieval process. Discusses implications for the design of interfaces to support users' EES processes and further research.

Types

  • a 364
  • s 12
  • el 10
  • m 8
  • r 8
  • x 6
  • p 1