This database contains over 40,000 documents on topics from the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Vegt, A. van der ; Zuccon, G. ; Koopman, B.: Do better search engines really equate to better clinical decisions? : If not, why not?
In: Journal of the Association for Information Science and Technology. 72(2021) no.2, pp.141-155.
Abstract: Previous research has found that improved search engine effectiveness, evaluated using a batch-style approach, does not always translate to significant improvements in user task performance; however, these prior studies focused on simple recall and precision-based search tasks. We investigated the same relationship, but for realistic, complex search tasks required in clinical decision making. One hundred and nine clinicians and final year medical students answered 16 clinical questions. Although the search engine did improve answer accuracy by 20 percentage points, there was no significant difference when participants used a more effective, state-of-the-art search engine. We also found that the search engine effectiveness difference, identified in the lab, was diminished by around 70% when the search engines were used with real users. Despite the aid of the search engine, half of the clinical questions were answered incorrectly. We further identified the relative contribution of search engine effectiveness to the overall end task success. We found that the ability to interpret documents correctly was a much more important factor impacting task success. If these findings are representative, information retrieval research may need to reorient its emphasis towards helping users to better understand information, rather than just finding it for them.
Content: Cf. https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24398.
Subject area: Search engines ; Retrieval studies
2. Kholghi, M. ; Vine, L.D. ; Sitbon, L. ; Zuccon, G. ; Nguyen, A.: Clinical information extraction using small data : an active learning approach based on sequence representations and word embeddings.
In: Journal of the Association for Information Science and Technology. 68(2017) no.11, pp.2543-2556.
Abstract: This article demonstrates the benefits of using sequence representations based on word embeddings to inform the seed selection and sample selection processes in an active learning pipeline for clinical information extraction. Seed selection refers to choosing an initial sample set to label to form an initial learning model. Sample selection refers to selecting informative samples to update the model at each iteration of the active learning process. Compared to supervised machine learning approaches, active learning offers the opportunity to build statistical classifiers with a reduced amount of training samples that require manual annotation. Reducing the manual annotation effort can support automating the clinical information extraction process. This is particularly beneficial in the clinical domain, where manual annotation is a time-consuming and costly task, as it requires extensive labor from clinical experts. Our empirical findings demonstrate that (a) using sequence representations along with the length of sequence for seed selection shows potential towards more effective initial models, and (b) using sequence representations for sample selection leads to significantly lower manual annotation efforts, with up to 3% and 6% fewer tokens and concepts requiring annotation, respectively, compared to state-of-the-art query strategies.
Content: Cf. http://onlinelibrary.wiley.com/doi/10.1002/asi.23936/full.
Note: Contribution to a special issue on biomedical information retrieval.
3. Koopman, B. ; Zuccon, G. ; Bruza, P. ; Nguyen, A.: What makes an effective clinical query and querier?
In: Journal of the Association for Information Science and Technology. 68(2017) no.11, pp.2557-2571.
Abstract: In this paper, we perform an in-depth study into how clinicians represent their information needs and the influence this has on information retrieval (IR) effectiveness. While much research in IR has considered the effectiveness of IR systems, there is still a significant gap in the understanding of how users contribute to the effectiveness of these systems. The paper aims to contribute to this by studying how clinicians search for information. Multiple representations of an information need, from verbose patient case descriptions to ad-hoc queries, were considered in order to understand their effect on retrieval. Four clinicians provided queries and performed relevance assessment to form a test collection used in this study. The different query formulation strategies of each clinician, and their effectiveness, were investigated. The results show that query formulation had more impact on retrieval effectiveness than the particular retrieval systems used. The most effective queries were short, ad-hoc keyword queries. Different clinicians were observed to consistently adopt specific query formulation strategies. The most effective queriers were those who, given their information need, inferred novel keywords most likely to appear in relevant documents. This study reveals aspects of how people search within the clinical domain. This can help inform the development of new models and methods that specifically focus on the query formulation process to improve retrieval effectiveness.
Content: Cf. http://onlinelibrary.wiley.com/doi/10.1002/asi.23959/full.
Note: Contribution to a special issue on biomedical information retrieval.
4. Koopman, B. ; Zuccon, G. ; Bruza, P. ; Sitbon, L. ; Lawley, M.: Information retrieval as semantic inference : a graph Inference model applied to medical search.
In: Information Retrieval Journal. 19(2016) no.1, pp.6-37.
Abstract: This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism achieved as a traversal over the graph. The model is proposed to tackle the semantic gap problem: the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation is applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources as well as unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision by an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference did adversely affect effectiveness.
These conclusions reflect the fact that for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.
Content: Cf. DOI: 10.1007/s10791-015-9268-9.
Subject area: Semantic context in indexing and retrieval ; Knowledge representation
5. Symonds, M. ; Bruza, P. ; Zuccon, G. ; Koopman, B. ; Sitbon, L. ; Turner, I.: Automatic query expansion : a structural linguistic perspective.
In: Journal of the Association for Information Science and Technology. 65(2014) no.8, pp.1577-1596.
Abstract: A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
Subject area: Semantic context in indexing and retrieval ; Computational linguistics ; Retrieval algorithms