This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 3 March 2020)
1. Lee, D.: Hornbostel-Sachs Classification of Musical Instruments.
In: Knowledge organization. 47(2020) no.1, pp.72-91.
Abstract: This paper discusses the Hornbostel-Sachs Classification of Musical Instruments. This classification system was originally designed for musical instruments and books about instruments, and was first published in German in 1914. Hornbostel-Sachs has dominated organological discourse and practice since its creation, and this article analyses the scheme's context, background, versions and impact. The position of Hornbostel-Sachs in the history and development of instrument classification is explored. This is followed by a detailed analysis of the mechanics of the scheme, including its decimal notation, the influential broad categories of the scheme, its warrant and its typographical layout. The version history of the scheme is outlined and the relationships between versions are visualised, including its translations, the introduction of the electrophones category and the Musical Instruments Museums Online (MIMO) version designed for a digital environment. The reception of Hornbostel-Sachs is analysed, and its usage, criticism and impact are all considered. As well as dominating organological research and practice for over a century, it is shown that Hornbostel-Sachs also had a significant influence on the bibliographic classification of music.
Note: Derived from the article of similar title in the ISKO Encyclopedia of Knowledge Organization Version 1.1 (= 1.0 plus details on electrophones and Wikipedia); version 1.0 published 2019-01-17, this version 2019-05-29. Article category: KOS, specific (domain specific). The author would like to thank the anonymous reviewers for their useful comments, as well as the editor, Professor Birger Hjørland, for all of his insightful comments and ideas.
Object: Hornbostel-Sachs Classification of Musical Instruments
2. Lee, D.H. ; Brusilovsky, P.: ¬The first impression of conference papers : does it matter in predicting future citations?
In: Journal of the Association for Information Science and Technology. 70(2019) no.1, pp.83-95.
Abstract: This article explores the factors influencing the future citations of conference papers. We concentrated on the explanatory power of early attention on conference papers for citations collected from Google Scholar and Scopus. The early attention data includes users' online activities in a conference support system: CN3. Bookmarks from the bibliographic management system, Citeulike, were used as a collateral source of early attention. To examine the chronological contributions of 13 factors on citations, a multiple sequential regression analysis was conducted for three timepoints of the publication cycle: paper submission, time of conferences, and months after conferences. Our results illustrate that online readers' early attention, in the form of Citeulike bookmarks, had the most influence on the future impact of the conference papers. The early attention records from CN3 made noteworthy improvements to explaining both the Google and Scopus citations as well. We also found that the type of papers, the number of papers presented at a conference, and the best article award records were significant factors influencing future citations. However, the magnitude of the effects made by online readers' early attention from both sources appears to be larger than these three traditional factors.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24100.
3. Lee, D. ; Robinson, L. ; Bawden, D.: Modeling the relationship between scientific and bibliographic classification for music.
In: Journal of the Association for Information Science and Technology. 70(2019) no.3, pp.230-241.
Abstract: Scientific classification is an important topic in contemporary knowledge organization discourse, yet the nature of the relationships between scientific and bibliographic classifications has not been fully studied. This article considers the connections between scientific and bibliographic classifications for music, taking general discourse about scientific classification and domain analysis as its starting point. Three relationship characteristics are posited: similarity, causation, and time. In discussions about similarity, "accords" and "discords" are analyzed. Further, the idea of a scale of accord is introduced, and issues with assuming a univocal scientific or bibliographic classification of music are discussed. Causation and the idea of influence between scientific and bibliographic classifications for music are unpicked. The connections between accordance and influence are explored, and the concept of differing purposes for different classification approaches is analyzed. A temporal dimension is considered, and the dynamic nature of connections between music scientific and bibliographic classifications is established. The idea of bifurcation (a change of accordance over time) is introduced, which is prominent for musical instrument classification. The concluding model visualizes similarity, causation and temporal aspects as three dimensions, showing how scientific and bibliographic classifications for music are connected through a set of interconnected and complex relationships.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24120.
4. Lee, D. ; Robinson, L.: ¬The heart of music classification : toward a model of classifying musical medium.
In: Journal of documentation. 74(2018) no.2, pp.258-277.
Abstract: Purpose: The purpose of this paper is to understand the classification of musical medium, which is a critical part of music classification. It considers how musical medium is currently classified, provides a theoretical understanding of what is currently problematic, and proposes a model which rethinks the classification of medium and resolves these issues. Design/methodology/approach: The analysis is drawn from existing classification schemes, additionally using musicological and knowledge organization literature where relevant. The paper culminates in the design of a model of musical medium. Findings: The analysis elicits sub-facets, orders and categorizations of medium: there is a strict categorization between vocal and instrumental music, a categorization based on broad size, and important sub-facets for multiples, accompaniment and arrangement. Problematically, there is a mismatch between the definitiveness of library and information science vocal/instrumental categorization and the blurred nature of real musical works; arrangements and accompaniments are limited by other categorizations; multiple voices and groups are not accommodated. So, a model with a radical new structure is proposed which resolves these classification issues. Research limitations/implications: The results could be used to further understanding of music classification generally, for Western art music and other types of music. Practical implications: The resulting model could be used to improve and design new classification schemes and to improve understanding of music retrieval. Originality/value: Deep theoretical analysis of music classification is rare, so this paper's approach is original. Furthermore, the paper's value lies in studying a vital area of music classification which is not currently understood, and providing explanations and solutions. The proposed model is novel in structure and concept, and its original structure could be adapted for other knotty subjects.
Content: Cf.: https://www.emeraldinsight.com/doi/full/10.1108/JD-08-2017-0120.
5. Stvilia, B. ; Wu, S. ; Lee, D.J.: Researchers' uses of and disincentives for sharing their research identity information in research information management systems.
In: Journal of the Association for Information Science and Technology. 69(2018) no.8, pp.1035-1045.
Abstract: This study examined how researchers used research information management systems (RIMSs) and the relationships among researchers' seniority, discipline, and types and extent of RIMS use. Most researchers used RIMSs to discover research content. Fewer used RIMSs for sharing and promoting their research. Early career researchers were more frequent users of RIMSs than were associate and full professors. Likewise, assistant professors and postdocs exhibited a higher probability of using RIMSs to promote their research than did students and full professors. Humanities researchers were the least frequent users of RIMSs. Moreover, humanities scholars used RIMSs to evaluate research less than did scholars in other disciplines. The tasks of discovering papers, monitoring the literature, identifying potential collaborators, and promoting research were predictors of higher RIMS use. Researchers who engaged in promoting their research, evaluating research, or monitoring the literature showed a greater propensity to have a public RIMS profile. Furthermore, researchers mostly agreed that not being required, having no effect on their status, not being useful, or not being a norm were reasons for not having a public RIMS profile. Humanities scholars were also more likely than social scientists to agree that having a RIMS profile was not a norm in their fields.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.24019.
6. Lee, D. ; Robinson, L. ; Bawden, D.: Global knowledge organization, "super-facets" and music : universal music classification in the digital age.
In: Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro and M.E. Cerveira. Baden-Baden : Ergon Verlag, 2018. pp.248-255.
(Advances in knowledge organization; vol.16)
7. Lee, S. ; Ha, T. ; Lee, D. ; Kim, J.H.: Understanding the majority opinion formation process in online environments : an exploratory approach to Facebook.
In: Information processing and management. 54(2018) no.6, pp.1115-1128.
Abstract: Majority opinions are often observed in the process of social interaction in online communities, but few studies have addressed this issue with empirical data. To identify an appropriate theoretical lens for explaining majority opinions in online environments, this study investigates the skewness statistic, which indicates how many "Likes" are skewed to major comments on a Facebook post; 3489 posts are gathered from the New York Times Facebook page for 100 days. Results show that time is not an influential factor for skewness increase, but the number of comments has a logarithmic relation to skewness increase. Regression models and Chow tests show that this relationship differs depending on topic contents, but majority opinions are significant overall. These results suggest that the bandwagon effect due to social affordance can be a suitable mechanism for explaining majority opinion formation in an online environment and that majority opinions in online communities can be misperceived due to overestimation.
Content: Cf.: https://doi.org/10.1016/j.ipm.2018.08.002.
8. Lee, D.: Numbers, instruments and hands : the impact of faceted analytical theory on classifying music ensembles.
In: Knowledge organization. 44(2017) no.6, pp.405-415.
Abstract: This article considers a particularly knotty aspect of classifying notated music: the classification of instrumental ensembles, where the term ensembles is defined as music written for multiple players with only one player per part. Facet analysis is used to examine this area of music classification and as the basis of a model for classifying ensembles. The conceptual analysis is aided by examples drawn from two classification schemes: British Catalogue of Music Classification (BCMC) and Flexible Classification. First, this exploration reveals that there are conceptually four sub-facets for classifying instrument ensembles, and that the omission of any of these sub-facets causes issues within classification schemes. Next, the different types of relationships between pairs of these sub-facets are delineated, including hierarchical and associative relationships. The classification of ensembles is depicted in a novel way, as a series of inter-connected relationships between sub-facets. Finally, the article ascertains exactly what is being counted, including introducing potential extra sets of sub-facets pertaining to performers and hands. So, facet analysis helps to create a model for classifying instrumental ensembles which provides a novel solution to this historically problematic area of music classification, as well as suggesting a potentially generalizable new way of thinking about complex relationships between sub-facets.
Content: Contribution to a special issue: Selected Papers from the International UDC Seminar 2017, Faceted Classification Today: Theory, Technology and End Users, 14-15 September, London UK.
9. Stvilia, B. ; Hinnant, C.C. ; Wu, S. ; Worrall, A. ; Lee, D.J. ; Burnett, K. ; Burnett, G. ; Kazmer, M.M. ; Marty, P.F.: Research project tasks, data, and perceptions of data quality in a condensed matter physics community.
In: Journal of the Association for Information Science and Technology. 66(2015) no.2, pp.246-263.
Abstract: To be effective and at the same time sustainable, a community data curation model needs to be aligned with the community's current data practices, including research project activities, data types, and perceptions of data quality. Based on a survey of members of the condensed matter physics (CMP) community gathered around the National High Magnetic Field Laboratory, a large national laboratory, this article defines a model of CMP research project tasks consisting of 10 task constructs. In addition, the study develops a model of data quality perceptions by CMP scientists consisting of four data quality constructs. The paper also discusses relationships among the data quality perceptions, project roles, and demographic characteristics of CMP scientists. The findings of the study can inform the design of a CMP data curation model that is aligned and harmonized with the community's research work structure and data practices.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23177/abstract.
10. Lee, D.: Webs of "Wirkung" : modelling the interconnectedness of classification schemes.
In: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik. Würzburg : Ergon Verlag, 2014. pp.200-207.
(Advances in knowledge organization; vol. 14)
Abstract: This paper explores relationships between different classification schemes. It suggests how these relationships could be considered part of the reception of a scheme, in particular as an aspect of its "Wirkung". Both intra-domain and inter-domain scheme relationships are examined, and are combined with pre-existing research on intra-scheme relationships. A model is posited which maps inter-scheme relationships, showing some of the complexities evoked in analysing the connections between classification schemes. Musical instrument (organology) classification is used as an example throughout the paper, to illustrate the ideas being discussed.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/aiko_vol_14_2014_28.pdf.
11. Lee, D.J.L. ; Stvilia, B.: Developing a data identifier taxonomy.
In: Cataloging and classification quarterly. 52(2014) no.3, pp.303-336.
Abstract: As the amount of research data management is growing, the use of identity metadata for discovering, linking, and citing research data is growing too. To support the awareness of different identifier systems and the comparison and selection of an identifier for a particular data management environment, there is a need for a knowledge base. This article contributes to that goal and analyzes the data management and related literatures to develop a data identifier taxonomy. The taxonomy includes four categories (domain, entity types, activities, and quality dimensions). In addition, the article describes 14 identifiers referenced in the literature and analyzes them along the taxonomy.
12. Lee, D.H. ; Schleyer, T.: Social tagging is no substitute for controlled indexing : a comparison of Medical Subject Headings and CiteULike tags assigned to 231,388 papers.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.9, pp.1747-1757.
Abstract: Social tagging and controlled indexing both facilitate access to information resources. Given the increasing popularity of social tagging and the limitations of controlled indexing (primarily cost and scalability), it is reasonable to investigate to what degree social tagging could substitute for controlled indexing. In this study, we compared CiteULike tags to Medical Subject Headings (MeSH) terms for 231,388 citations indexed in MEDLINE. In addition to descriptive analyses of the data sets, we present a paper-by-paper analysis of tags and MeSH terms: the number of common annotations, Jaccard similarity, and coverage ratio. In the analysis, we apply three increasingly progressive levels of text processing, ranging from normalization to stemming, to reduce the impact of lexical differences. Annotations of our corpus consisted of over 76,968 distinct tags and 21,129 distinct MeSH terms. The top 20 tags/MeSH terms showed little direct overlap. On a paper-by-paper basis, the number of common annotations ranged from 0.29 to 0.5 and the Jaccard similarity from 2.12% to 3.3% using increased levels of text processing. At most, 77,834 citations (33.6%) shared at least one annotation. Our results show that CiteULike tags and MeSH terms are quite distinct lexically, reflecting different viewpoints/processes between social tagging and controlled indexing.
Subject field: Indexing studies ; Social tagging
Object: MeSH ; CiteULike ; MEDLINE
13. Lee, D.: Classifying musical performance : the application of classification theories to concert programmes.
In: Knowledge organization. 38(2011) no.6, pp.530-540.
Abstract: This paper demonstrates how knowledge organisation theories can be used to understand the arrangement of concert programmes. Key classification theories from the management of libraries, archives and ephemera collections are used as a framework in this study: characteristics of division (faceted classification theory), provenance (archival arrangement) and arrangement by format (ephemera arrangement). Each theory is used to analyse the arrangement of specific concert programme collections held at the Centre for Performance History, Royal College of Music, London. Two classification models are created from the analysis. Model 1 reveals how concert programme arrangement could be viewed as a theoretical bridge between bibliographic, archival and ephemera arrangement theories. This model proposes a unified classification based on bibliographic characteristics of division; the characteristics of division structure is populated with characteristics taken from bibliographical classification, archival arrangement and ephemera organisation. Model 2 proposes an alternative way of considering the unified classification model: a triumvirate of event, programme and individual copy. Complex relationships between elements of the triumvirate are explored, along with an analysis of how various characteristics fit into the model.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_38_2011_6_f.pdf.
14. Dang, E.K.F. ; Luk, R.W.P. ; Allan, J. ; Ho, K.S. ; Chung, K.F.L. ; Lee, D.L.: ¬A new context-dependent term weight computed by boost and discount using relevance information.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.12, pp.2514-2530.
Abstract: We studied the effectiveness of a new class of context-dependent term weights for information retrieval. Unlike the traditional term frequency-inverse document frequency (TF-IDF), the new weighting of a term t in a document d depends not only on the occurrence statistics of t alone but also on the terms found within a text window (or "document-context") centered on t. We introduce a Boost and Discount (B&D) procedure which utilizes partial relevance information to compute the context-dependent term weights of query terms according to a logistic regression model. We investigate the effectiveness of the new term weights compared with the context-independent BM25 weights in the setting of relevance feedback. We performed experiments with title queries of the TREC-6, -7, -8, and 2005 collections, comparing the residual Mean Average Precision (MAP) measures obtained using B&D term weights and those obtained by a baseline using BM25 weights. Given either 10 or 20 relevance judgments of the top retrieved documents, using the new term weights yields improvement over the baseline for all collections tested. The MAP obtained with the new weights has relative improvement over the baseline by 3.3 to 15.2%, with statistical significance at the 95% confidence level across all four collections.
15. Nah, I.W. ; Kang, D.-S. ; Lee, D.-H. ; Chung, Y.-C.: ¬A bibliometric evaluation of research performance in different subject categories.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.6, pp.1138-1143.
Abstract: In this article, bibliometric indicators with publications and citations are used for a direct research-performance comparison among different or interdisciplinary categories, the work of individual scientists, and their research teams and institutions. For example, basic research performances of some projects at the Korea Institute of Science and Technology (KIST) were assessed using bibliographic factors with IPQ-Normalized impact factor to compare with an international level and other research groups in different or interdisciplinary fields. Some research teams at KIST showed higher quality publications in terms of the international measures.
16. Li, D. ; Kwong, C.-P. ; Lee, D.L.: Unified linear subspace approach to semantic analysis.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.1, pp.175-189.
Abstract: The Basic Vector Space Model (BVSM) is well known in information retrieval. Unfortunately, its retrieval effectiveness is limited because it is based on literal term matching. The Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) are two prominent semantic retrieval methods, both of which assume there is some underlying latent semantic structure in a dataset that can be used to improve retrieval performance. However, while this structure may be derived from both the term space and the document space, GVSM exploits only the former and LSI the latter. In this article, the latent semantic structure of a dataset is examined from a dual perspective; namely, we consider the term space and the document space simultaneously. This new viewpoint has a natural connection to the notion of kernels. Specifically, a unified kernel function can be derived for a class of vector space models. The dual perspective provides a deeper understanding of the semantic space and makes transparent the geometrical meaning of the unified kernel function. New semantic analysis methods based on the unified kernel function are developed, which combine the advantages of LSI and GVSM. We also prove that the new methods are stable: even when the selected rank of the truncated Singular Value Decomposition (SVD) is far from the optimum, the retrieval performance will not be degraded significantly. Experiments performed on standard test collections show that our methods are promising.
Subject field: Semantic environment in indexing and retrieval
Object: Latent Semantic Indexing ; Generalized Vector Space Model
17. Dang, E.K.F. ; Luk, R.W.P. ; Ho, K.S. ; Chan, S.C.F. ; Lee, D.L.: ¬A new measure of clustering effectiveness : algorithms and experimental studies.
In: Journal of the American Society for Information Science and Technology. 59(2008) no.3, pp.390-406.
Abstract: We propose a new optimal clustering effectiveness measure, called CS1, based on a combination of clusters rather than selecting a single optimal cluster as in the traditional MK1 measure. For hierarchical clustering, we present an algorithm to compute CS1, defined by seeking the optimal combinations of disjoint clusters obtained by cutting the hierarchical structure at a certain similarity level. By reformulating the optimization to a 0-1 linear fractional programming problem, we demonstrate that an exact solution can be obtained by a linear time algorithm. We further discuss how our approach can be generalized to more general problems involving overlapping clusters, and we show how optimal estimates can be obtained by greedy algorithms.
Subject field: Automatic classification
18. Wong, W.S. ; Luk, R.W.P. ; Leong, H.V. ; Ho, K.S. ; Lee, D.L.: Re-examining the effects of adding relevance information in a relevance feedback environment.
In: Information processing and management. 44(2008) no.3, pp.1086-1116.
Abstract: This paper presents an investigation of how to automatically formulate effective queries using full or partial relevance information (i.e., the terms that are in relevant documents) in the context of relevance feedback (RF). The effects of adding relevance information in the RF environment are studied via controlled experiments. The conditions of these controlled experiments are formalized into a set of assumptions that form the framework of our study. This framework is called the idealized relevance feedback (IRF) framework. In our IRF settings, we confirm the previous findings of relevance feedback studies. In addition, our experiments show that better retrieval effectiveness can be obtained when (i) we normalize the term weights by their ranks, (ii) we select weighted terms in the top K retrieved documents, (iii) we include terms in the initial title queries, and (iv) we use the best query sizes for each topic instead of the average best query size where they produce at most five percentage points improvement in the mean average precision (MAP) value. We have also achieved a new level of retrieval effectiveness which is about 55-60% MAP instead of 40+% in the previous findings. This new level of retrieval effectiveness was found to be similar to a level using a TREC ad hoc test collection that is about double the number of documents in the TREC-3 test collection used in previous works.
19. Bird, S. ; Dale, R. ; Dorr, B. ; Gibson, B. ; Joseph, M. ; Kan, M.-Y. ; Lee, D. ; Powley, B. ; Radev, D. ; Tan, Y.F.: ¬The ACL Anthology Reference Corpus : a reference dataset for bibliographic research in computational linguistics.
In: Proceedings of Language Resources and Evaluation Conference (LREC 08). Marrakesh, Morocco, May [http://acl-arc.comp.nus.edu.sg/lrec08.pdf].
Abstract: The ACL Anthology is a digital archive of conference and journal papers in natural language processing and computational linguistics. Its primary purpose is to serve as a reference repository of research results, but we believe that it can also be an object of study and a platform for research in its own right. We describe an enriched and standardized reference corpus derived from the ACL Anthology that can be used for research in scholarly document processing. This corpus, which we call the ACL Anthology Reference Corpus (ACL ARC), brings together the recent activities of a number of research groups around the world. Our goal is to make the corpus widely available, and to encourage other researchers to use it as a standard testbed for experiments in both bibliographic and bibliometric research.
Content: Cf. the corpus at: http://acl-arc.comp.nus.edu.sg/. ; See also: Automatic Term Recognition (ATR) is a research task that deals with the identification of domain-specific terms. Terms, in simple words, are textual realizations of significant concepts in an expertise domain. Additionally, domain-specific terms may be classified into a number of categories, in which each category represents a significant concept. A term classification task is often defined on top of an ATR procedure to perform such categorization. For instance, in the biomedical domain, terms can be classified as drugs, proteins, and genes. This is a reference dataset for terminology extraction and classification research in computational linguistics. It is a set of manually annotated terms in English language that are extracted from the ACL Anthology Reference Corpus (ACL ARC). The ACL ARC is a canonicalised and frozen subset of scientific publications in the domain of Human Language Technologies (HLT). It consists of 10,921 articles from 1965 to 2006. The dataset, called ACL RD-TEC, is comprised of more than 69,000 candidate terms that are manually annotated as valid and invalid terms. Furthermore, valid terms are classified as technology and non-technology terms. Technology terms refer to a method, process, or in general a technological concept in the domain of HLT, e.g. machine translation, word sense disambiguation, and language modelling. On the other hand, non-technology terms refer to important concepts other than technological; examples of such terms in the domain of HLT are multilingual lexicon, corpora, word sense, and language model. The dataset is created to serve as a gold standard for the comparison of the algorithms of term recognition and classification. [http://catalog.elra.info/product_info.php?products_id=1236].
Object: ACL Anthology Reference Corpus