This database contains more than 40,000 documents on topics in descriptive cataloging, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Lee, K. ; Kim, S.Y. ; Kim, E.H.-J. ; Song, M.: Comparative evaluation of bibliometric content networks by tomographic content analysis : an application to Parkinson's disease.
In: Journal of the Association for Information Science and Technology. 68(2017) no.5, pp.1295-1307.
Abstract: To understand the current state of a discipline and to discover new knowledge of a certain theme, one builds bibliometric content networks based on the present knowledge entities. However, such networks can vary according to the collection of data sets relevant to the theme by querying knowledge entities. In this study we classify three different bibliometric content networks. The primary bibliometric network is based on knowledge entities relevant to a keyword of the theme, the secondary network is based on entities associated with the lower concepts of the keyword, and the tertiary network is based on entities influenced by the theme. To explore the content and properties of these networks, we propose a tomographic content analysis that takes a slice-and-dice approach to analyzing the networks. Our findings indicate that the primary network is best suited to understanding the current knowledge on a certain topic, whereas the secondary network is good at discovering new knowledge across fields associated with the topic, and the tertiary network is appropriate for outlining the current knowledge of the topic and relevant studies.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23752/full.
2. Song, M. ; Kim, S.Y. ; Lee, K.: Ensemble analysis of topical journal ranking in bioinformatics.
In: Journal of the Association for Information Science and Technology. 68(2017) no.6, pp.1564-1583.
Abstract: Journal rankings, frequently determined by the journal impact factor or similar indices, are quantitative measures for evaluating a journal's performance in its discipline, which is presently a major research thrust in the bibliometrics field. Recently, text mining was adopted to augment journal ranking-based evaluation with the content analysis of a discipline taking a time-variant factor into consideration. However, previous studies focused mainly on a silo analysis of a discipline using either citation- or content-oriented approaches, and no attempt was made to analyze topical journal ranking and its change over time in a seamless and integrated manner. To address this issue, we propose a journal-time-topic model, an extension of Dirichlet multinomial regression, which we applied to the field of bioinformatics to understand journal contribution to topics in a field and the shift of topic trends. The journal-time-topic model allows us to identify which journals are the major leaders in what topics and the manner in which their topical focus changes. It also helps reveal an interesting distinct pattern in the journal impact factor of high- and low-ranked journals. The study results shed new light on topic-specific journal rankings and shifts in journals' concentration on a subject.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23840/full.
3. Kim, S. ; Ko, Y. ; Oard, D.W.: Combining lexical and statistical translation evidence for cross-language information retrieval.
In: Journal of the Association for Information Science and Technology. 66(2015) no.1, pp.23-39.
Abstract: This article explores how best to use lexical and statistical translation evidence together for cross-language information retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the adverse effect of translation ambiguity. Coverage statistics for NII Testbeds and Community for Information Access Research (NTCIR) queries confirm that these resources have complementary strengths. Experiments with translation evidence from a small parallel corpus indicate that even rather rough estimates of translation probabilities can yield further improvements over a strong technique for translation weighting based on using Jensen-Shannon divergence as a term-association measure. Finally, a novel approach to posttranslation query expansion using a random walk over the Wikipedia concept link graph is shown to yield further improvements over alternative techniques for posttranslation query expansion. Evaluation results on the NTCIR-5 English-Korean test collection show statistically significant improvements over strong baselines.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23153/abstract.
Subject area: Computational linguistics ; Multilingual problems
4. Kim, S.U.: Exploring the knowledge development process of English language learners at a high school : how do English language proficiency and the nature of research task influence student learning?
In: Journal of the Association for Information Science and Technology. 66(2015) no.1, pp.128-143.
Abstract: This study aims to understand the learning experience of English language learners (ELLs) within the framework of Kuhlthau's Information Search Process (ISP). Forty-eight ELL students from three classes at a high school participated in the study while they conducted a research project in English. Data were collected through demographic questionnaire and process surveys. Students' demographic information, knowledge about their research topic, labeling of knowledge, estimate of interest and knowledge, and learning outcomes were collected and analyzed with content analysis and statistical techniques. The findings indicate that ELL students, as a whole group, showed significant increases in their topical knowledge and estimate of interest and knowledge as they progressed in the research project, which are consistent with what other ISP-based studies found. When three different English proficiency-level groups were compared, only the intermediate group showed significant increases in topical knowledge and estimate of knowledge throughout the process. Also, different research tasks impacted the amount and substance of knowledge students built and their estimated knowledge during the research project. The findings led to suggestions for instructional strategies such as learning goals reflecting various kinds of learning, differentiated instructions in mixed-ability classrooms, structured interventions, personalized research topics, and teacher-school librarian collaborations.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23164/abstract.
5. Song, M. ; Kim, S.Y. ; Zhang, G. ; Ding, Y. ; Chambers, T.: Productivity and influence in bioinformatics : a bibliometric analysis using PubMed Central.
In: Journal of the Association for Information Science and Technology. 65(2014) no.2, pp.352-371.
Abstract: Bioinformatics is a fast-growing field based on the optimal use of "big data" gathered in genomic, proteomics, and functional genomics research. In this paper, we conduct a comprehensive and in-depth bibliometric analysis of the field of bioinformatics by extracting citation data from PubMed Central full-text. Citation data for the period 2000 to 2011, comprising 20,869 papers with 546,245 citations, was used to evaluate the productivity and influence of this emerging field. Four measures were used to identify productivity: most productive authors, most productive countries, most productive organizations, and most popular subject terms. Research impact was analyzed based on the measures of most cited papers, most cited authors, emerging stars, and leading organizations. Results show the overall trends between the periods 2000 to 2003 and 2004 to 2007 were dissimilar, while trends between the periods 2004 to 2007 and 2008 to 2011 were similar. In addition, the field of bioinformatics has undergone a significant shift, co-evolving with other biomedical disciplines.
6. Liu, W. ; Doğan, R.I. ; Kim, S. ; Comeau, D.C. ; Kim, W. ; Yeganova, L. ; Lu, Z. ; Wilbur, W.J.: Author name disambiguation for PubMed.
In: Journal of the Association for Information Science and Technology. 65(2014) no.4, pp.765-781.
Abstract: Log analysis shows that PubMed users frequently use author names in queries for retrieving scientific literature. However, author name ambiguity may lead to irrelevant retrieval results. To improve the PubMed user experience with author name queries, we designed an author name disambiguation system consisting of similarity estimation and agglomerative clustering. A machine-learning method was employed to score the features for disambiguating a pair of papers with ambiguous names. These features enable the computation of pairwise similarity scores to estimate the probability of a pair of papers belonging to the same author, which drives an agglomerative clustering algorithm regulated by 2 factors: name compatibility and probability level. With transitivity violation correction, high precision author clustering is achieved by focusing on minimizing false-positive pairing. Disambiguation performance is evaluated with manual verification of random samples of pairs from clustering results. When compared with a state-of-the-art system, our evaluation shows that among all the pairs the lumping error rate drops from 10.1% to 2.2% for our system, while the splitting error rises from 1.8% to 7.7%. This results in an overall error rate of 9.9%, compared with 11.9% for the state-of-the-art method. Other evaluations based on gold standard data also show the increase in accuracy of our clustering. We attribute the performance improvement to the machine-learning method driven by a large-scale training set and the clustering algorithm regulated by a name compatibility scheme preferring precision. With integration of the author name disambiguation system into the PubMed search engine, the overall click-through-rate of PubMed users on author name query results improved from 34.9% to 36.9%.
7. Szpakowicz, S. ; Bond, F. ; Nakov, P. ; Kim, S.N.: On the semantics of noun compounds.
In: Natural language engineering. 2013, April, pp.1-2.
Abstract: The noun compound - a sequence of nouns which functions as a single noun - is very common in English texts. No language processing system should ignore expressions like steel soup pot cover if it wants to be serious about such high-end applications of computational linguistics as question answering, information extraction, text summarization, machine translation - the list goes on. Processing noun compounds, however, is far from trouble-free. For one thing, they can be bracketed in various ways: is it steel soup, steel pot, or steel cover? Then there are relations inside a compound, annoyingly not signalled by any words: does pot contain soup or is it for cooking soup? These and many other research challenges are the subject of this special issue.
Content: Cf.: http://journals.cambridge.org/article_S1351324913000090.
8. Kim, S. ; Cho, S.: Characteristics of Korean personal names.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.1, pp.86-95.
Abstract: Korea, along with Asia at large, is producing more and more valuable academic materials. Furthermore, the demand for academic materials produced in non-Western societies is increasing among English-speaking users. In order to search among such material, users rely on keywords such as author names. However, Asian nations such as Korea and China have markedly different methods of writing personal names from Western naming traditions. Among these differences are name components, structure, writing customs, and distribution of surnames. These differences influence the Anglicization of Asian academic researchers' names, often leading to them being written in various fashions, unlike Western personal names. These inconsistent formats can often lead to difficulties in searching and finding academic materials for Western users unfamiliar with Korean and Asian personal names. This article presents methods for precisely understanding and categorizing Korean personal names in order to make academic materials by Korean authors easier to find for Westerners. As such, this article discusses characteristics particular to Korean personal names and furthermore analyzes how the personal names of Korean academic researchers are currently being written in English.
9. Kim, S. ; Oh, S.: Users' relevance criteria for evaluating answers in a social Q&A site.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.4, pp.716-727.
Abstract: This study examines the criteria questioners use to select the best answers in a social Q&A site (Yahoo! Answers) within the theoretical framework of relevance research. A social Q&A site is a novel environment where people voluntarily ask and answer questions. In Yahoo! Answers, the questioner selects the answer that best satisfies his or her question and leaves comments on it. Under the assumption that the comments reflect the reasons why questioners select particular answers as the best, this study analyzed 2,140 comments collected from Yahoo! Answers during December 2007. The content analysis identified 23 individual relevance criteria in six classes: Content, Cognitive, Utility, Information Sources, Extrinsic, and Socioemotional. A major finding is that the selection criteria used in a social Q&A site have considerable overlap with many relevance criteria uncovered in previous relevance studies, but that the scope of socio-emotional criteria has been expanded to include the social aspect of this environment. Another significant finding is that the relative importance of individual criteria varies according to topic categories. Socioemotional criteria are popular in discussion-oriented categories, content-oriented criteria in topic-oriented categories, and utility criteria in self-help categories. This study generalizes previous relevance studies to a new environment by going beyond an academic setting.
Object: Yahoo! Answers
10. Chung, D.S. ; Kim, S.: Blogging activity among cancer patients and their companions : uses, gratifications, and predictors of outcomes.
In: Journal of the American Society for Information Science and Technology. 59(2008) no.2, pp.297-306.
Abstract: This study examines cancer patients' and companions' uses and gratifications of blogs and the relationship between different types of blogging activities and gratification outcomes. In an online survey of 113 respondents, cancer patients were found to be more likely than their companions to host their own blogs. Four areas emerged as gratifications of blog use: prevention and care, problem-solving, emotion management, and information-sharing. Cancer patients and companions both found blogging activity to be most helpful for emotion management and information-sharing. Further, cancer patients were more gratified than their companions in the areas of emotion management and problem-solving. Regression analyses indicate that perceived credibility of blogs, posting comments on others' blogs, and hosting one's own blog significantly increased the explanatory power of the regression models for each gratification outcome.
11. Kim, S. ; Rasmussen, E.: Characteristics of tissue-centric biomedical researchers using a survey and cluster analysis.
In: Journal of the American Society for Information Science and Technology. 59(2008) no.8, pp.1210-1223.
Abstract: The objective of this study was to characterize the types of tissue-centric users based on tissue use, requirements, and their job or work-related variables at the University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA. A self-reporting questionnaire was distributed to biomedical researchers at the UPMC. Descriptive and cluster analyses were performed to identify and characterize the complex types of tissue-based researchers. A total of 62 respondents completed the survey, and two clusters were identified based on all variables. Two distinct groups of tissue-centric users made direct use of tissue samples for their research as well as associated information, while a third group of indirect users required only the associated information. The study shows that tissue-centric users were composed of various types. These types were distinguished in terms of tissue use and data requirements, as well as by their work or research-related activities.
12. Son, H.-J. ; Kim, S.-H. ; Kim, J.-S.: Text image matching without language model using a Hausdorff distance.
In: Information processing and management. 44(2008) no.3, pp.1189-1200.
Abstract: In this paper, we propose a text matching method for document image retrieval without any language model. Two word images are first normalized to an appropriate size and image features are extracted using the local crowdedness method. Similarity between the two features is then measured by calculating a Hausdorff distance. We performed three experiments. The first experiment proves the effectiveness of the proposed method for text matching, and the other two experiments verify the language independence and font size independence of the proposed method.
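Note on record 12: the Hausdorff distance mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the local crowdedness features are computed, so the example below assumes, purely for illustration, that each word image has already been reduced to a small set of 2-D feature points; the names `directed_hausdorff`, `hausdorff`, `word_a`, and `word_b` are hypothetical and not taken from the paper.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical feature points for two word images (assumed, not from the paper).
word_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
word_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]

# A smaller distance means the two feature sets are more similar.
print(hausdorff(word_a, word_b))
```

A distance of zero means every point in one set coincides with a point in the other; a retrieval system along these lines would presumably rank candidate word images by increasing distance.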
13. Kim, K.-S. ; Kim, S.-C.J. ; Park, S.-J. ; Zhu, X. ; Polparsi, J.: Facet analyses of categories used in Web directories : a comparative study.
Abstract: Faceted classification is believed to be suitable for organizing digital information resources. Based on a faceted classification model suggested for Web resources (Zins, 2002), the current study analyzed popular Web directories from different Asian countries/areas and examined cultural differences reflected in their classification systems. Three popular Web directories from four countries/regions (China, Hong Kong, Korea, and Thailand) were selected and their classifications were analyzed and compared: a local Yahoo and two home-grown Web directories from each country/region. Based on the findings, the study suggests a model that might be more suitable to Asian culture.
Content: Paper presented at the 72nd IFLA General Conference and Council, 20-24 August 2006, Seoul, Korea
14. Kim, S.K.: Romanization in cataloging of Korean materials.
In: Cataloging and classification quarterly. 43(2006) no.2, pp.53-76.
Abstract: This paper analyzes cataloging rules for Korean materials focusing on the McCune-Reischauer (MR) system, the Korean romanization scheme currently used in the United States. This system has been used for a long time in many Western countries, and was officially adopted by the Library of Congress (LC) for use in the cataloging of Korean language materials. Considering users' information-seeking behavior and searching abilities, however, the MR system has many drawbacks for increasing users' ability to retrieve information. This paper analyzes bibliographic records in academic libraries, the LC, and the Research Libraries Information Network (RLIN) to identify the issues and problems of the MR system. A user survey demonstrates that the MR system is not a user-customized tool based on users' searching ability. Several solutions are suggested to overcome the limitations of the MR system.
Content: See also: http://catalogingandclassificationquarterly.com/
15. Seo, H.-C. ; Kim, S.-B. ; Rim, H.-C. ; Myaeng, S.-H.: Improving query translation in English-Korean cross-language information retrieval.
In: Information processing and management. 41(2005) no.3, pp.507-522.
Abstract: Query translation is a viable method for cross-language information retrieval (CLIR), but it suffers from translation ambiguities caused by multiple translations of individual query terms. Previous research has employed various methods for disambiguation, including the method of selecting an individual target query term from multiple candidates by comparing their statistical associations with the candidate translations of other query terms. This paper proposes a new method where we examine all combinations of target query term translations corresponding to the source query terms, instead of looking at the candidates for each query term and selecting the best one at a time. The goodness value for a combination of target query terms is computed based on the association value between each pair of the terms in the combination. We tested our method using the NTCIR-3 English-Korean CLIR test collection. The results show some improvements regardless of the association measures we used.
Subject area: Multilingual problems
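Note on record 15: the combination-based disambiguation described in the abstract can be sketched as follows. The abstract says only that the goodness value of a combination is computed from the association values between each pair of terms; summing those pairwise values is an assumption made here for illustration, and the function name `best_translation`, the candidate lists, and the association scores are all hypothetical.

```python
from itertools import combinations, product


def best_translation(candidates, assoc):
    """Pick one translation per source term by scoring whole combinations.

    candidates: one list of candidate target terms per source query term.
    assoc: maps frozenset({term1, term2}) to an association value.
    The goodness of a combination is assumed to be the sum of its
    pairwise association values (missing pairs count as 0.0).
    """
    def goodness(combo):
        return sum(assoc.get(frozenset(pair), 0.0)
                   for pair in combinations(combo, 2))

    # Enumerate every combination instead of choosing term by term.
    return max(product(*candidates), key=goodness)


# Hypothetical candidates for a two-term source query.
candidates = [["bank_finance", "bank_river"],
              ["interest_rate", "interest_hobby"]]
# Hypothetical association values between target terms.
assoc = {
    frozenset({"bank_finance", "interest_rate"}): 0.9,
    frozenset({"bank_river", "interest_hobby"}): 0.1,
}

print(best_translation(candidates, assoc))
```

Exhaustive enumeration grows with the product of the candidate list sizes, so a sketch like this is only practical for short queries with few candidates per term.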
16. Hara, N. ; Solomon, P. ; Kim, S.-L. ; Sonnenwald, D.H.: An emerging view of scientific collaboration : scientists' perspectives on collaboration and factors that impact collaboration.
In: Journal of the American Society for Information Science and Technology. 54(2003) no.10, pp.952-965.
Abstract: Collaboration is often a critical aspect of scientific research, which is dominated by complex problems, rapidly changing technology, dynamic growth of knowledge, and highly specialized areas of expertise. An individual scientist can seldom provide all of the expertise and resources necessary to address complex research problems. This paper describes collaboration among a group of scientists, and considers how their experiences are socially shaped. The scientists were members of a newly formed distributed, multi-disciplinary academic research center that was organized into four multi-disciplinary research groups. Each group had 14 to 34 members, including faculty, postdoctoral fellows and students, at four geographically dispersed universities. To investigate challenges that emerge in establishing scientific collaboration, data were collected about members' previous and current collaborative experiences, perceptions regarding collaboration, and work practices during the center's first year of operation. The data for the study includes interviews with members of the Center, observations of videoconferences and meetings, and a Center-wide sociometric survey. Data analysis has led to the development of a framework that identifies forms of collaboration that emerged among scientists (e.g., complementary and integrative collaboration) and associated factors, which influenced collaboration including personal compatibility, work connections, incentives, and infrastructure. These results may inform the specification of social and organizational practices, which are needed to establish collaboration in distributed, multi-disciplinary research centers.
17. Kim, S.H. ; Eastman, C.M.: An experiment on node size in a hypermedia system.
In: Journal of the American Society for Information Science. 50(1999) no.6, pp.530-536.
Abstract: The node size that should be used in a hypermedia system is an important design issue. Three interpretations of node size are identified: storage (physical size), window size (presentation size), and length (logical size). An experiment in which presentation size and text length are varied in a HyperCard application is described. The experiment involves student subjects performing a fact retrieval task from a reference handbook. No interaction is found between these 2 independent variables. Performance is significantly better for the longer texts, but no significant difference is found for the 2 different window sizes.
18. Park, K.S. ; Kim, S.H.: Fuzzy cognitive maps considering time relationships.
In: International journal of human-computer studies. 42(1995) no.2, pp.157-168.
Abstract: Causal knowledge is often cyclic and fuzzy, thus it is hard to represent in the form of trees. A fuzzy cognitive map (FCM) can represent causal knowledge as a signed directed graph with feedback. Provides an intuitive framework in which to form decision problems as perceived by decision makers and to incorporate the knowledge of experts. Proposes a fuzzy time cognitive map (FTCM), which is an FCM with a time relationship introduced on its arrows. Discusses the characteristics and basic assumptions of the FCM, and presents a description of causal propagation in an FCM with causalities (negative, positive, neutral) in the interval [-1,1]. Develops a method of translating an FTCM that has different time lags into an FTCM that has one, or the same, unit time lag, which is a value-preserving translation. With the FTCM, illustrates the analysis of changes in causalities among factors over time.
19. Cortez, E.M. ; Park, S.C. ; Kim, S.: The hybrid application of an inductive learning method and a neural network.
In: Information processing and management. 31(1995) no.6, pp.789-813.
Abstract: Traditional information retrieval systems based on Boolean logic suffer from 2 inherent problems: (1) inaccurate or incomplete query representation, and (2) inconsistent indexing. While many researchers have demonstrated that neural networks can solve the incomplete query problems for information retrieval, the inconsistent indexing problem still remains unsolved. In this paper, we present a hybrid methodology of integrating an inductive learning technique with a neural network (connectionist model) in order to solve both inconsistent indexing and incomplete query problems. Since an inductive learning technique has the ability to identify the most significant document index terms with various levels of relationship to their semantic significance, it provides a possible solution to the problem of inconsistent indexing. This paper reports the first phase of research that demonstrates how a neural network augmented by an inductive learning technique results in effective information retrieval performance in the areas that demand flexible inferencing and reasoning when incomplete queries and inconsistent indexing problems are present.
20. Cortez, E.M. ; Park, S.C. ; Kim, S.: The hybrid application of an inductive learning method and a neural network for intelligent information retrieval.
In: Information processing and management. 31(1995) no.6, pp.789-813.
Abstract: Traditional information retrieval systems based on Boolean logic suffer from 2 inherent problems: inaccurate or incomplete query representation, and inconsistent indexing. While many researchers have demonstrated that neural networks can solve the incomplete query problems for information retrieval, the inconsistent indexing problem still remains unsolved. Presents a hybrid methodology of integrating an inductive learning technique with a neural network (connectionist model) in order to solve both inconsistent indexing and incomplete query problems. Since an inductive learning technique has the ability to identify the most significant document index terms with various levels of relationship to their semantic significance, it provides a possible solution to the problem of inconsistent indexing. Reports the 1st phase of research that demonstrates how a neural network augmented by an inductive learning technique results in effective information retrieval performance in the areas that demand flexible inferencing and reasoning when incomplete queries and inconsistent indexing problems are present.