This database contains approx. 39,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 19 October 2016)
1. Golub, K. ; Soergel, D. ; Buchanan, G. ; Tudhope, D. ; Lykke, M. ; Hiom, D.: A framework for evaluating automatic indexing or classification in the context of retrieval.
In: Journal of the Association for Information Science and Technology. 67(2016) no.1, pp.3-16.
(Advances in information science)
Abstract: Tools for automatic subject assignment help deal with scale and sustainability in creating and enriching metadata, establishing more connections across and between resources and enhancing consistency. Although some software vendors and experimental researchers claim the tools can replace manual subject indexing, hard scientific evidence of their performance in operating information environments is scarce. A major reason for this is that research is usually conducted in laboratory conditions, excluding the complexities of real-life systems and situations. The article reviews and discusses issues with existing evaluation approaches such as problems of aboutness and relevance assessments, implying the need to use more than a single "gold standard" method when evaluating indexing and retrieval, and proposes a comprehensive evaluation framework. The framework is informed by a systematic review of the literature on evaluation approaches: evaluating indexing quality directly through assessment by an evaluator or through comparison with a gold standard, evaluating the quality of computer-assisted indexing directly in the context of an indexing workflow, and evaluating indexing quality indirectly through analyzing retrieval performance.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23600/abstract.
Subject area: Automatic indexing ; Automatic classification
2. Lykke, M. ; Lund, H. ; Skov, M.: User-driven CHAOS : tags and annotations in radio broadcast research.
In: Knowledge organization. 43(2016) no.2, pp.73-85.
Abstract: CHAOS (Cultural Heritage Archive Open System) provides streaming access to more than 500,000 broadcasts by the Danish Broadcast Corporation from 1931 onwards. The archive is part of the LARM project, whose purpose is to enable researchers to search, annotate, and interact with recordings. To support the researchers optimally, a user-centred approach was taken to develop the platform and the related metadata scheme. Based on the requirements, a three-level metadata scheme was developed: 1) core archival metadata, 2) LARM metadata, and 3) project-specific metadata. The paper analyses how researchers apply the metadata scheme in their research work. The purpose is to gain insight into broadcast researchers' tagging practice and motivation for tagging in order to inform the future design of digital cultural heritage systems. The study consists of two parts: a) a qualitative study of subjects and vocabulary of the applied metadata and annotations, and b) five semi-structured interviews about goals for tagging. The findings clearly show that the primary role of LARM.fm is to provide access to broadcasts and provide tools to segment and manage concrete segments of radio broadcasts. Although the assigned metadata are project-specific, they have been applied to serve as invaluable access points for fellow researchers due to their factual and neutral nature. The researchers particularly stress LARM.fm's strength in providing streaming access to a large, shared corpus of broadcasts.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_43_2016_2_a.pdf.
Form covered: AV materials
Field of application: Media archives (radio/television)
3. Golub, K. ; Lykke, M. ; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification.
In: Journal of documentation. 70(2014) no.5, pp.801-828.
Abstract: Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was the DDC, which also comprised mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
Subject area: Social tagging ; Automatic indexing
4. Svarre, T. ; Lykke, M.: Experiences with automated categorization in e-government information retrieval.
In: Knowledge organization. 41(2014) no.1, pp.76-84.
Abstract: High-precision search results are essential for supporting e-government employees' information tasks. Prior studies have shown that existing features of e-government retrieval systems need improvement in terms of search facilities (e.g., Goh et al. 2008), navigation (e.g., de Jong and Lentz 2006) and metadata (e.g., Kopackova, Michalek and Cejna 2010). This paper investigates how automated categorization can enhance information organization and retrieval, and presents the results of a realistic evaluation that compared automated categorization with free text indexing of the government intranet used by Danish tax authorities. The evaluation demonstrates a potential for automated categorization in a government context. In terms of quantitative measures, free text indexing performed at the same level as or better than searching by categorization. However, the qualitative analysis revealed that categorized overviews were useful if the participant did not possess much knowledge of the task at hand. When task knowledge was present, categorization was used to support the assumptions of a correct search. Participants avoided automated categorization if high-precision documents were among the top results or if few documents were retrieved. The findings emphasise the importance of simultaneous search options for e-government IR systems, and reveal that automated categorization is valuable in improving search facilities in e-government.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_41_2014_1_g.pdf. ; Papers from the ISKO-UK Biennial Conference, "Knowledge Organization: Pushing the Boundaries," United Kingdom, 8-9 July, 2013, London.
5. Lykke, M. ; Price, S. ; Delcambre, L.: How doctors search : a study of query behaviour and the impact on search results.
In: Information processing and management. 48(2012) no.6, pp.1151-1170.
Abstract: Professional, workplace searching is different from general searching, because it is typically limited to specific facets and targeted to a single answer. We have developed the semantic component (SC) model, which is a search feature that allows searchers to structure and specify the search to context-specific aspects of the main topic of the documents. We have tested the model in an interactive searching study with family doctors in order to explore doctors' querying behaviour, how they applied the means for specifying a search, and how these features contributed to the search outcome. In general, the doctors were capable of exploiting system features and search tactics during the searching. Most searchers produced well-structured queries that contained appropriate search facets. When searches failed, it was not due to query structure or query length. Failures were mostly caused by the well-known vocabulary problem. The problem was exacerbated by using certain filters as Boolean filters. The best working queries were structured into 2-3 main facets out of 3-5 possible search facets, and expressed with terms reflecting the focal view of the search task. The findings both support and extend previous results about query structure and exhaustivity, showing the importance of selecting central search facets and expressing them from the perspective of the search task. The SC model was applied in all but one of the highest-performing queries. The findings suggest that the model might be a helpful feature for structuring queries into central, appropriate facets and for returning highly relevant documents.
Content: Cf.: doi: 10.1016/j.ipm.2012.02.006.
6. Golub, K. ; Lykke, M.: Automated classification of web pages in hierarchical browsing.
In: Journal of documentation. 65(2009) no.6, pp.901-925.
Abstract: Purpose - The purpose of this study is twofold: to investigate whether it is meaningful to use the Engineering Index (Ei) classification scheme for browsing, and then, if proven useful, to investigate the performance of an automated classification algorithm based on the Ei classification scheme. Design/methodology/approach - A user study was conducted in which users solved four controlled searching tasks. The users browsed the Ei classification scheme in order to examine the suitability of the classification system for browsing. The classification algorithm was evaluated by the users, who judged the correctness of the automatically assigned classes. Findings - The study showed that the Ei classification scheme is suited for browsing. Automatically assigned classes were on average partly correct, with some classes working better than others. Browsing success proved to be correlated with and dependent on classification correctness. Research limitations/implications - Further research should address problems of disparate evaluations of one and the same web page. Additional reasons behind browsing failures in the Ei classification scheme also need further investigation. Practical implications - Improvements for browsing were identified: describing class captions and/or listing their subclasses from the start; allowing for searching for words from class captions with synonym search (easily provided for Ei since the classes are mapped to thesauri terms); when searching for class captions, returning the hierarchical tree expanded around the class in whose caption the search term is found. The need for improvements of classification schemes was also indicated. Originality/value - A user-based evaluation of automated subject classification in the context of browsing has not been conducted before; hence the study also presents new findings concerning methodology.
Subject area: Automatic classification ; Classification systems in online retrieval
Object: Engineering Index Classification
7. Lykke, M.: Networked Knowledge Organization Systems/Services (NKOS).
In: Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates. London : Taylor & Francis, 2009. pp.xx-xx.
Abstract: The NKOS Community is described in this entry. NKOS (http://nkos.slis.kent.edu/) is an informal network of academics and practitioners who are interested in the use of knowledge organization systems (KOS) in networked information environments. The general aim of the community is to enable KOS to act as networked information services (both machine-to-machine and human-computer), and support the description and retrieval of information resources on the Internet. The community is a forum for presentation and discussion of KOS applications, and interchange of ideas, from technical issues to intellectual, semantic, and terminological problems related to the use of KOS. The participants come from a variety of disciplines, and from academia as well as practice, and interact and communicate by a diverse set of means: annual workshops in the United States and Europe, a Web site, a mailing list, and publication of special journal issues and working papers about contemporary issues. The NKOS community represents topical diversity, informality, and multiple perspectives on networked KOS applications. There is an implicit danger that this variety diverts the focus and discussion. However, it appears from the analysis that the constancy in the organization of the activities and the well-established peer review process are adequate to attract participants, maintain satisfactory quality, and keep focus in multiplicity.
Note: Cf.: http://www.tandfonline.com/doi/book/10.1081/E-ELIS3.
8. Lykke, M. ; Price, S.L. ; Delcambre, L.M.L.: Using semantic components to represent and search domain-specific documents : an evaluation of indexing accuracy and consistency.
In: Paradigms and conceptual systems in knowledge organization: Proceedings of the Eleventh International ISKO conference, Rome, 23-26 February 2010, ed. Claudio Gnoli, Indeks, Frankfurt M. Würzburg : Ergon Verlag, pp.276-282.
(Advances in knowledge organization; vol.12)
Abstract: We developed the semantic component (SC) model that supplements existing representations of document content by exploiting domain-specific characteristics of document types and content. The model allows users to express queries against domain-specific components of documents as well as against whole documents. We performed a comparative indexing study as an experimental case study using a national health portal as a framework in order to assess the feasibility of SC indexing. Feasibility includes determining indexing time, perceived difficulty of the indexing task, and indexing accuracy and consistency. The findings show that the feasibility of SC indexing appears to be in the same general range as for assigned indexing by use of indexing terms.