Search (42 results, page 2 of 3)

  • theme_ss:"Retrievalstudien"
  • year_i:[2010 TO 2020}
  1. Vechtomova, O.: Facet-based opinion retrieval from blogs (2010) 0.00
    0.0020506454 = product of:
      0.004101291 = sum of:
        0.004101291 = product of:
          0.008202582 = sum of:
            0.008202582 = weight(_text_:a in 4225) [ClassicSimilarity], result of:
              0.008202582 = score(doc=4225,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1544581 = fieldWeight in 4225, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4225)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The paper presents methods of retrieving blog posts that contain opinions about an entity expressed in the query. The methods use a lexicon of subjective words and phrases compiled from manually and automatically developed resources. One method uses the Kullback-Leibler divergence to weight subjective words occurring near query terms in documents, another uses the proximity between occurrences of query terms and subjective words in documents, and a third combines both factors. Methods of structuring queries into facets, facet expansion using Wikipedia, and facet-based retrieval are also investigated. The methods were evaluated using the TREC 2007 and 2008 Blog track topics and proved to be highly effective.
    Type
    a
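     The score trace above is standard Lucene ClassicSimilarity "explain" output: the per-term score is the query weight (idf x queryNorm) multiplied by the field weight (sqrt(tf) x idf x fieldNorm), then damped by the coord(1/2) factors. A minimal sketch (Python, not part of the catalogue output) that reproduces the figures shown for entry 1:

       import math

       def classic_similarity(freq, idf, query_norm, field_norm, coords):
           tf = math.sqrt(freq)                  # 2.4494898 for freq = 6.0
           query_weight = idf * query_norm       # 0.053105544
           field_weight = tf * idf * field_norm  # 0.1544581
           score = query_weight * field_weight   # 0.008202582
           for c in coords:                      # the two coord(1/2) factors
               score *= c
           return score

       # values taken from the explain tree of entry 1 (doc 4225)
       print(classic_similarity(6.0, 1.153047, 0.046056706, 0.0546875, [0.5, 0.5]))
       # -> ~0.0020506454, the score listed above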
  2. Li, J.; Zhang, P.; Song, D.; Wu, Y.: Understanding an enriched multidimensional user relevance model by analyzing query logs (2017) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 3961) [ClassicSimilarity], result of:
              0.008118451 = score(doc=3961,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 3961, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3961)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Modeling multidimensional relevance in information retrieval (IR) has attracted much attention in recent years. However, most existing studies rely on relatively small-scale user studies, which may not reflect real-world, natural search scenarios. In this article, we propose to study the multidimensional user relevance model (MURM) on large-scale query logs, which record users' various search behaviors (e.g., query reformulations, clicks, and dwell time) in natural search settings. We extend an existing MURM (comprising five dimensions: topicality, novelty, reliability, understandability, and scope) with two additional dimensions, interest and habit, which represent personalized relevance judgments on retrieved documents. Further, for each dimension of the enriched MURM, a set of computable features is formulated. By conducting extensive document-ranking experiments on Bing query logs and TREC Session Track data, we systematically investigated the impact of each dimension on retrieval performance and gained a series of insightful findings that may benefit the design of future IR systems.
    Type
    a
  3. Munkelt, J.; Schaer, P.; Lepsky, K.: Towards an IR test collection for the German National Library (2018) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 4311) [ClassicSimilarity], result of:
              0.008118451 = score(doc=4311,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 4311, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4311)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Automatic content indexing is one of the innovations that are increasingly changing the way libraries work. In theory, it promises a cataloguing service that would hardly be possible with human effort alone in terms of speed, quantity and perhaps quality. The German National Library (DNB) has recognised this potential and is increasingly relying on automatic indexing of its catalogue content. The DNB took a major step in this direction in 2017, announced in two papers. The announcement was rather restrained, but its content is all the more explosive for the library community: since September 2017, the DNB has discontinued the intellectual subject indexing of series B and H and has switched to an automatic process for these series. The subject indexing of online publications (series O) has been purely automatic since 2010; from September 2017, monographs and periodicals published outside the publishing industry as well as university publications are no longer indexed by people. This raises the question: what is the quality of the automatic indexing compared to the manual work, or, in other words, to what degree can automatic indexing replace people without a significant drop in quality?
    Type
    a
  4. Borlund, P.: ¬A study of the use of simulated work task situations in interactive information retrieval evaluations : a meta-evaluation (2016) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 2880) [ClassicSimilarity], result of:
              0.008118451 = score(doc=2880,freq=18.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 2880, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2880)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose - The purpose of this paper is to report a study of how the test instrument of a simulated work task situation is used in empirical evaluations of interactive information retrieval (IIR) and reported in the research literature. In particular, the author is interested in learning whether the requirements for employing simulated work task situations are followed, and whether these requirements call for further highlighting and refinement. Design/methodology/approach - In order to study how simulated work task situations are used, the relevant research literature was identified, partly via citation analysis using Web of Science® and partly by systematic search of online repositories. On this basis, 67 individual publications were identified; they constitute the sample of analysis. Findings - The analysis reveals a need for clarification of how to use simulated work task situations in IIR evaluations, in particular with respect to the design and creation of realistic simulated work task situations. There is a lack of tailoring of the simulated work task situations to the test participants, and the requirement to include the test participants' personal information needs is likewise neglected. Further, there is a need to add and emphasise a requirement to document the simulated work task situations used when reporting IIR studies. Research limitations/implications - Insight into the use of simulated work task situations has implications for the test design of IIR studies and hence for the knowledge base generated on the basis of such studies. Originality/value - Simulated work task situations are widely used in IIR studies, and the present study is the first comprehensive study of the intended and unintended use of this test instrument since its introduction in the late 1990s. The paper addresses the need to carefully design and tailor simulated work task situations to suit the test participants in order to obtain the intended authentic and realistic IIR under study.
    Type
    a
  5. Kutlu, M.; Elsayed, T.; Lease, M.: Intelligent topic selection for low-cost information retrieval evaluation : a new perspective on deep vs. shallow judging (2018) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 5092) [ClassicSimilarity], result of:
              0.008118451 = score(doc=5092,freq=18.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 5092, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5092)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     While test collections provide the cornerstone for Cranfield-based evaluation of information retrieval (IR) systems, it has become practically infeasible to rely on traditional pooling techniques to construct test collections at the scale of today's massive document collections (e.g., ClueWeb12's 700M+ webpages). This has motivated a flurry of studies proposing more cost-effective yet reliable IR evaluation methods. In this paper, we propose a new intelligent topic selection method which reduces the number of search topics (and thereby costly human relevance judgments) needed for reliable IR evaluation. To rigorously assess our method, we integrate previously disparate lines of research on intelligent topic selection and deep vs. shallow judging (i.e., whether it is more cost-effective to collect many relevance judgments for a few topics or a few judgments for many topics). While prior work on intelligent topic selection has never been evaluated against shallow-judging baselines, prior work on deep vs. shallow judging has largely argued for shallow judging, but assuming random topic selection. We argue that for evaluating any topic selection method, one must ultimately ask whether it is actually useful to select topics, or whether one should simply perform shallow judging over many topics. In seeking a rigorous answer to this overarching question, we conduct a comprehensive investigation over a set of relevant factors never previously studied together: 1) the method of topic selection; 2) the effect of topic familiarity on human judging speed; and 3) how different topic generation processes (requiring varying human effort) impact (i) budget utilization and (ii) the resulting quality of judgments. Experiments on the NIST TREC Robust 2003 and Robust 2004 test collections show not only that we can reliably evaluate IR systems with fewer topics, but also that: 1) when topics are intelligently selected, deep judging is often more cost-effective than shallow judging in evaluation reliability; and 2) topic familiarity and topic generation costs greatly impact the evaluation cost vs. reliability trade-off. Our findings challenge conventional wisdom in showing that deep judging is often preferable to shallow judging when topics are selected intelligently.
    Type
    a
  6. Leiva-Mederos, A.; Senso, J.A.; Hidalgo-Delgado, Y.; Hipola, P.: Working framework of semantic interoperability for CRIS with heterogeneous data sources (2017) 0.00
    0.001913537 = product of:
      0.003827074 = sum of:
        0.003827074 = product of:
          0.007654148 = sum of:
            0.007654148 = weight(_text_:a in 3706) [ClassicSimilarity], result of:
              0.007654148 = score(doc=3706,freq=16.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14413087 = fieldWeight in 3706, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3706)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose - Information from Current Research Information Systems (CRIS) is stored in different formats, on platforms that are not compatible, or even in independent networks. It would be helpful to have a well-defined methodology that allows data management and processing from a single site, so as to take advantage of the capacity to link dispersed data found in different systems, platforms, sources and/or formats. Based on the functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources. Design/methodology/approach - After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes the following: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of re-utilization (copyright and licenses); and the "dimensions" included in the data set as well as the semantics of these dimensions (the syntax and the taxonomies of reference). The semantic interoperability framework presented here implements semantic alignment and metadata crosswalks to convert information from three different systems (ABCD, Moodle and DSpace) and to integrate all the databases in a single RDF file. Findings - The paper also includes an evaluation based on comparing, by means of recall and precision calculations, the proposed model with identical queries made on Open Archives Initiative and SQL interfaces, in order to estimate its efficiency. The results are satisfactory, since the semantic interoperability facilitates the exact retrieval of information. Originality/value - The proposed model enhances management of the syntactic and semantic interoperability of the CRIS system designed. In a real setting of use it achieves very positive results.
    Type
    a
  7. Thornley, C.V.; Johnson, A.C.; Smeaton, A.F.; Lee, H.: ¬The scholarly impact of TRECVid (2003-2009) (2011) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 4363) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=4363,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 4363, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4363)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This paper reports on an investigation into the scholarly impact of the TRECVid (TREC Video Retrieval Evaluation) benchmarking conferences between 2003 and 2009. The contribution of TRECVid to research in video retrieval is assessed by analyzing publication content to show the development of techniques and approaches over time, and by analyzing publication impact through publication numbers and citation analysis. Popular conference and journal venues for TRECVid publications are identified in terms of the number of citations received. For a selection of participants at different career stages, the relative importance of TRECVid publications, in terms of citations, vis-à-vis their other publications is investigated. TRECVid, as an evaluation conference, provides data on which research teams 'scored' highly against the evaluation criteria, and the relationship between 'top scoring' teams at TRECVid and the 'top scoring' papers in terms of citations is analyzed. A strong relationship was found between 'success' at TRECVid and 'success' in citations, both for high-scoring and low-scoring teams. The implications of the study for the value of TRECVid as a research activity, and for the value of bibliometric analysis as a research evaluation tool, are discussed.
    Type
    a
  8. Vakkari, P.; Huuskonen, S.: Search effort degrades search output but improves task outcome (2012) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 46) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=46,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 46, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=46)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     We analyzed how effort in searching is associated with search output and task outcome. In a field study, we examined how students' search effort for an assigned learning task was associated with precision and relative recall, and how these were associated with the quality of the learning outcome. The study subjects were 41 medical students writing essays for a class in medicine; searching in Medline was part of their assignment. The data comprised the students' search logs in Medline, their assessments of the usefulness of the references retrieved, a questionnaire concerning the search process, and the evaluation scores of the essays given by the teachers. Pearson correlations were calculated to answer the research questions, and finally a path model for predicting task outcome was built. We found that effort in the search process degraded precision but improved task outcome. There were two major mechanisms reducing precision while enhancing task outcome: effort in expanding Medical Subject Heading (MeSH) terms within search sessions, and effort in assessing and exploring documents in the result list between sessions, degraded precision but led to a better task outcome. Thus, human effort compensated for poor retrieval results on the way to a good task outcome. The findings suggest that traditional effectiveness measures in information retrieval should be complemented with evaluation measures for the search process and its outcome.
    Type
    a
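     The study above relates search effort to precision and relative recall computed from the students' Medline sessions. As a point of reference, here is a minimal sketch (Python; the paper's exact operationalisation of relative recall may differ) of the two measures:

       def precision(retrieved, relevant):
           # share of retrieved references the searcher judged useful
           return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

       def relative_recall(retrieved, relevant_pool):
           # recall measured against the pool of useful references known across
           # all sessions, used as a stand-in for true recall (an assumption,
           # not necessarily the study's definition)
           return len(retrieved & relevant_pool) / len(relevant_pool) if relevant_pool else 0.0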
  9. Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 3144) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=3144,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 3144, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3144)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Users of web search engines are known to focus mostly on the top-ranked results of the search engine result page. While many studies support this well-known information-seeking pattern, only a few concentrate on the question of what users are missing by neglecting lower-ranked results. To learn more about the relevance distributions in the so-called long tail, we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in content between the head and the tail of the search engine result list, we find no statistically significant differences in the binary relevance judgments and only weakly significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists, but more evaluation is needed to clearly describe the differences.
    Type
    a
  10. Womser-Hacker, C.: Evaluierung im Information Retrieval (2013) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 728) [ClassicSimilarity], result of:
              0.006765375 = score(doc=728,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 728, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=728)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  11. Naderi, H.; Rumpler, B.: PERCIRS: a system to combine personalized and collaborative information retrieval (2010) 0.00
    0.0015127839 = product of:
      0.0030255679 = sum of:
        0.0030255679 = product of:
          0.0060511357 = sum of:
            0.0060511357 = weight(_text_:a in 3960) [ClassicSimilarity], result of:
              0.0060511357 = score(doc=3960,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11394546 = fieldWeight in 3960, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3960)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose - This paper aims to discuss and test the claim that personalization techniques can be valuable for improving the efficiency of collaborative information retrieval (CIR) systems. Design/methodology/approach - A new personalized CIR system, called PERCIRS, is presented based on user profile similarity calculation (UPSC) formulas. To this end, the paper proposes several UPSC formulas as well as two techniques to evaluate them. As the proposed CIR system is personalized, it could not be evaluated with Cranfield-like evaluation techniques (e.g. TREC). Hence, this paper proposes a new user-centric mechanism which enables PERCIRS to be evaluated. This mechanism is generic and can be used to evaluate any other personalized IR system. Findings - The results show that, among the UPSC formulas proposed in this paper, the (query-document)-graph-based formula is the most effective. After integrating this formula into PERCIRS and comparing it with nine other IR systems, it is concluded that its results are better than those of the other IR systems. In addition, the paper shows that the complexity of the system is lower than that of the other CIR systems. Research limitations/implications - The system asks users to explicitly rank the returned documents, although explicit ranking is still not widespread. However, the authors believe that users should actively participate in the IR process in order to properly satisfy their information needs. Originality/value - The value of this paper lies in combining collaborative and personalized IR, as well as in introducing a mechanism which enables the personalized IR system to be evaluated. The proposed evaluation mechanism is very valuable for developers of personalized IR systems. The paper also introduces some significant user profile similarity calculation formulas, and two techniques to evaluate them. These formulas can also be used to find a user's community in social networks.
    Type
    a
  12. Bashir, S.; Rauber, A.: On the relationship between query characteristics and IR functions retrieval bias (2011) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 4628) [ClassicSimilarity], result of:
              0.005858987 = score(doc=4628,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 4628, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4628)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Bias quantification of retrieval functions with the help of document retrievability scores has recently evolved as an important evaluation measure for recall-oriented retrieval applications. While numerous studies have evaluated the retrieval bias of retrieval functions, solid validation of its impact on realistic types of queries is still limited. This is due to the lack of well-accepted criteria for query generation for estimating retrievability. Commonly, random queries are used to approximate document retrievability, owing to the prohibitively large query space and the time involved in processing all queries. Additionally, a cumulative retrievability score of documents over all queries is used for analyzing retrieval functions' (retrieval) bias. However, this approach does not consider the differences between query characteristics (QCs) and their influence on the quantification of retrieval functions' bias. This article provides an in-depth study of retrievability over different QCs. It analyzes the correlation of lower/higher retrieval bias with different query characteristics. The presence of a strong correlation between retrieval bias and query characteristics in the experiments indicates the possibility of determining the retrieval bias of retrieval functions without processing an exhaustive query set. The experiments are validated on the TREC Chemical Retrieval Track, consisting of 1.2 million patent documents.
    Type
    a
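     For orientation, the cumulative retrievability score referred to in the abstract above is typically computed as the number of queries (from a large query set) for which a document appears within some rank cutoff. A hedged sketch (Python; the query set, ranking backend and cutoff are placeholders, not the authors' setup):

       def retrievability(doc_id, queries, run_query, cutoff=100):
           # r(d): how often doc_id surfaces in the top `cutoff` results
           # over the whole query set (cutoff value is an assumption)
           r = 0
           for q in queries:
               ranking = run_query(q)            # ranked list of doc ids
               if doc_id in ranking[:cutoff]:
                   r += 1
           return r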
  13. Ruthven, I.: Relevance behaviour in TREC (2014) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 1785) [ClassicSimilarity], result of:
              0.005858987 = score(doc=1785,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 1785, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1785)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose - The purpose of this paper is to examine how various types of TREC data can be used to better understand relevance and to serve as a test-bed for exploring relevance. The author proposes that many interesting studies can be performed on the TREC data collections that are not directly related to evaluating systems but to learning more about human judgements of information and relevance, and that these studies can provide useful research questions for other types of investigation. Design/methodology/approach - Through several case studies the author shows how existing data from TREC can be used to learn more about the factors that may affect relevance judgements and interactive search decisions, and to answer new research questions for exploring relevance. Findings - The paper uncovers factors, such as familiarity, interest and strictness of relevance criteria, that affect the nature of relevance assessments within TREC, contrasting these against findings from user studies of relevance. Research limitations/implications - The research considers only certain uses of TREC data and assessments given by professional relevance assessors, but it motivates further exploration of the TREC data so that the research community can further exploit the effort involved in the construction of TREC test collections. Originality/value - The paper presents an original viewpoint on relevance investigations and on TREC itself by motivating TREC as a source of inspiration for understanding relevance rather than purely as a source of evaluation material.
    Type
    a
  14. Dang, E.K.F.; Luk, R.W.P.; Allan, J.: ¬A context-dependent relevance model (2016) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 2778) [ClassicSimilarity], result of:
              0.005858987 = score(doc=2778,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 2778, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2778)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Numerous past studies have demonstrated the effectiveness of the relevance model (RM) for information retrieval (IR). This approach enables relevance or pseudo-relevance feedback to be incorporated within the language modeling framework of IR. In the traditional RM, the feedback information is used to improve the estimate of the query language model. In this article, we introduce an extension of RM in the setting of relevance feedback. Our method provides an additional way to incorporate feedback via the improvement of the document language models. Specifically, we make use of the context information of known relevant and nonrelevant documents to obtain weighted counts of query terms for estimating the document language models. The context information is based on the words (unigrams or bigrams) appearing within a text window centered on query terms. Experiments on several Text REtrieval Conference (TREC) collections show that our context-dependent relevance model can improve retrieval performance over the baseline RM. Together with previous studies within the BM25 framework, our current study demonstrates that the effectiveness of our method for using context information in IR is quite general and not limited to any specific retrieval model.
    Type
    a
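     The context information described in the abstract above is drawn from fixed-size text windows centred on query-term occurrences in the feedback documents. A rough illustration (Python; the window size and tokenisation are assumptions, not the authors' estimator) of how such windows can be collected before deriving weighted term counts:

       def context_windows(tokens, query_terms, w=5):
           # collect the words within +/- w tokens of each query-term occurrence
           # (w=5 is an illustrative choice, not the paper's setting)
           windows = []
           for i, tok in enumerate(tokens):
               if tok in query_terms:
                   windows.append(tokens[max(0, i - w): i + w + 1])
           return windows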
  15. Mandl, T.: Evaluierung im Information Retrieval : die Hildesheimer Antwort auf aktuelle Herausforderungen der globalisierten Informationsgesellschaft (2010) 0.00
    0.001353075 = product of:
      0.00270615 = sum of:
        0.00270615 = product of:
          0.0054123 = sum of:
            0.0054123 = weight(_text_:a in 4011) [ClassicSimilarity], result of:
              0.0054123 = score(doc=4011,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10191591 = fieldWeight in 4011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4011)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  16. Lu, K.; Kipp, M.E.I.: Understanding the retrieval effectiveness of collaborative tags and author keywords in different retrieval environments : an experimental study on medical collections (2014) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 1215) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=1215,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 1215, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1215)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This study investigates the retrieval effectiveness of collaborative tags and author keywords in different environments through controlled experiments. Three test collections were built. The first collection tests the impact of tags on retrieval performance when only the title and abstract are available (the abstract environment). The second tests the impact of tags when the full text is available (the full-text environment). The third compares the retrieval effectiveness of tags and author keywords in the abstract environment. In addition, both single-word queries and phrase queries are tested to understand the impact of different query types. Our findings suggest that including tags and author keywords in indexes can enhance recall but may improve or worsen average precision depending on retrieval environments and query types. Indexing tags and author keywords for searching using phrase queries in the abstract environment showed improved average precision, whereas indexing tags for searching using single-word queries in the full-text environment led to a significant drop in average precision. The comparison between tags and author keywords in the abstract environment indicates that they have comparable impact on average precision, but author keywords are more advantageous in enhancing recall. The findings from this study provide useful implications for designing retrieval systems that incorporate tags and author keywords.
    Type
    a
  17. Tamine, L.; Chouquet, C.; Palmer, T.: Analysis of biomedical and health queries : lessons learned from TREC and CLEF evaluation benchmarks (2015) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 2341) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=2341,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 2341, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2341)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     A large body of research has examined, from both the query side and the user-behavior side, the characteristics of medical- and health-related searches. One of the core issues in medical information retrieval (IR) is the diversity of tasks, which leads to a diversity of categories of information needs and queries. From the evaluation perspective, another related and challenging issue is the limited availability of appropriate test collections allowing the experimental validation of medically task-oriented IR techniques and systems. In this paper, we explore the peculiarities of TREC and CLEF medically oriented tasks and queries through an analysis of the differences and similarities between queries across tasks, with respect to length, specificity and clarity features, and then study their effect on retrieval performance. We show that, even for expert-oriented queries, the level of language specificity varies significantly across tasks, as does search difficulty. Additional findings highlight that query clarity factors are task-dependent and that query-term specificity, based on domain-specific terminology resources, is not significantly linked to term rareness in the document collection. The lessons learned from our study could serve as starting points for the design of future task-based medical information retrieval frameworks.
    Type
    a
  18. White, H.D.: Relevance theory and distributions of judgments in document retrieval (2017) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 5099) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=5099,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 5099, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5099)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article extends relevance theory (RT) from linguistic pragmatics into information retrieval. Using more than 50 retrieval experiments from the literature as examples, it applies RT to explain the frequency distributions of documents on relevance scales with three or more points. The scale points, which judges in experiments must consider in addition to queries and documents, are communications from researchers. In RT, the relevance of a communication varies directly with its cognitive effects and inversely with the effort of processing it. Researchers define and/or label the scale points to measure the cognitive effects of documents on judges. However, they apparently assume that all scale points as presented are equally easy for judges to process. Yet the notion that points cost variable effort explains fairly well the frequency distributions of judgments across them. By hypothesis, points that cost more effort are chosen by judges less frequently. Effort varies with the vagueness or strictness of scale-point labels and definitions. It is shown that vague scales tend to produce U- or V-shaped distributions, while strict scales tend to produce right-skewed distributions. These results reinforce the paper's more general argument that RT clarifies the concept of relevance in the dialogues of retrieval evaluation.
    Type
    a
  19. Becks, D.; Mandl, T.; Womser-Hacker, C.: Spezielle Anforderungen bei der Evaluierung von Patent-Retrieval-Systemen (2010) 0.00
    0.0011839407 = product of:
      0.0023678814 = sum of:
        0.0023678814 = product of:
          0.0047357627 = sum of:
            0.0047357627 = weight(_text_:a in 4667) [ClassicSimilarity], result of:
              0.0047357627 = score(doc=4667,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.089176424 = fieldWeight in 4667, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4667)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  20. Munkelt, J.: Erstellung einer DNB-Retrieval-Testkollektion (2018) 0.00
    0.0011839407 = product of:
      0.0023678814 = sum of:
        0.0023678814 = product of:
          0.0047357627 = sum of:
            0.0047357627 = weight(_text_:a in 4310) [ClassicSimilarity], result of:
              0.0047357627 = score(doc=4310,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.089176424 = fieldWeight in 4310, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4310)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a

Languages

  • e 35
  • d 7

Types