  1. Yi, K.: Harnessing collective intelligence in social tagging using Delicious (2012) 0.01
    A new collaborative approach in information organization and sharing has recently arisen, known as collaborative tagging or social indexing. A key element of collaborative tagging is the concept of collective intelligence (CI), which is a shared intelligence among all participants. This research investigates the phenomenon of social tagging in the context of CI with the aim of serving as a stepping-stone towards the mining of truly valuable social tags for web resources. This study focuses on assessing and evaluating the degree of CI embedded in social tagging over time in terms of two parameter values: the number of participants and the top frequency ranking window. Five different metrics were adopted and utilized for assessing the similarity between ranking lists: overlapList, overlapRank, Footrule, Fagin's measure, and the Inverse Rank measure. The results of this study demonstrate that a substantial degree of CI is most likely to be achieved when somewhere between the first 200 and 400 people have participated in tagging, and that a target degree of CI can be projected by controlling the two factors along with the selection of a similarity metric. The study also tests some experimental conditions for detecting social tags with a high degree of CI. The results of this study are applicable to the study of filtering social tags based on CI; filtered social tags may be utilized for the metadata creation of tagged resources and possibly for the retrieval of tagged resources.
    25.12.2012 15:22:37
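The comparison of ranking lists described in the abstract above can be illustrated with a minimal sketch. It assumes two top-k tag lists of equal length and uses a plain set overlap and a normalized Spearman footrule in which a missing tag is placed at rank k+1. The function names and sample tags are hypothetical, and these definitions are not necessarily the ones used in the study.

```python
def overlap(a, b):
    """Fraction of tags shared by two top-k lists, ignoring order."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length k")
    k = len(a)
    return len(set(a) & set(b)) / k if k else 1.0


def footrule(a, b):
    """Normalized Spearman footrule distance between two top-k lists.

    A tag absent from a list is assigned rank k+1 (an assumption).
    Returns 0.0 for identical lists and 1.0 for disjoint lists.
    """
    if len(a) != len(b):
        raise ValueError("lists must have the same length k")
    k = len(a)
    if k == 0:
        return 0.0
    rank_a = {tag: i + 1 for i, tag in enumerate(a)}
    rank_b = {tag: i + 1 for i, tag in enumerate(b)}
    total = sum(
        abs(rank_a.get(tag, k + 1) - rank_b.get(tag, k + 1))
        for tag in set(a) | set(b)
    )
    # k * (k + 1) is the distance between two disjoint lists of length k.
    return total / (k * (k + 1))


# Hypothetical top-4 tag lists taken at two points in time.
earlier = ["web", "design", "css", "tools"]
later = ["design", "web", "css", "blog"]
print(overlap(earlier, later), footrule(earlier, later))
```

A higher overlap and a lower footrule distance would both suggest that the top of the tag ranking has stabilized between the two snapshots.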
  2. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00
    22. 9.2008 18:31:54
  3. Yi, K.; Chan, L.M.: ¬A visualization software tool for Library of Congress Subject Headings (2008) 0.00
    The aim of this study is to develop a software tool, VisuaLCSH, for effective searching, browsing, and maintenance of LCSH. This tool enables visualizing subject headings and hierarchical structures implied and embedded in LCSH. A conceptual framework for converting the hierarchical structure of headings in LCSH to an explicit tree structure is proposed, described, and implemented. The highlights of VisuaLCSH are summarized below: 1) revealing multiple aspects of a heading; 2) normalizing the hierarchical relationships in LCSH; 3) showing multi-level hierarchies in LCSH sub-trees; 4) improving the navigational function of LCSH in retrieval; and 5) enabling the implementation of generic search, i.e., the 'exploding' feature, in searching LCSH.
    Advances in knowledge organization; vol.11
    Culture and identity in knowledge organization: Proceedings of the Tenth International ISKO Conference 5-8 August 2008, Montreal, Canada. Ed. by Clément Arsenault and Joseph T. Tennis
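The idea of turning implied hierarchical relationships into an explicit tree and then running a generic ('exploding') search over it can be sketched as follows. This is an illustration only: the broader/narrower pairs are made-up sample data, the function names are hypothetical, and nothing here reflects how VisuaLCSH itself is implemented.

```python
from collections import defaultdict


def build_tree(broader_narrower_pairs):
    """Map each broader heading to the list of its narrower headings."""
    children = defaultdict(list)
    for broader, narrower in broader_narrower_pairs:
        children[broader].append(narrower)
    return children


def explode(children, heading):
    """Return the heading plus all narrower headings, depth-first, each once."""
    seen, stack, result = set(), [heading], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against headings reachable by several paths
        seen.add(current)
        result.append(current)
        # Reverse so that children are visited in their original order.
        stack.extend(reversed(children.get(current, [])))
    return result


# Illustrative pairs only, not actual LCSH relationships.
pairs = [("Science", "Physics"), ("Science", "Chemistry"), ("Physics", "Optics")]
tree = build_tree(pairs)
print(explode(tree, "Science"))
```

The `seen` set matters because a heading may have more than one broader heading, so a naive traversal could return duplicates.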
  4. Yi, K.; Choi, N.; Kim, Y.S.: ¬A content analysis of Twitter hyperlinks and their application in web resource indexing (2016) 0.00
    Twitter has emerged as a popular channel for sharing and delivering news. Tweet messages often include URLs to web resources and hashtags. This study investigates the potential of the hyperlinks and hashtags as topical clues and indicators to tweet messages. For this study, we crawled and analyzed about 1.5 million tweets for a 3-month period covering any topic or subject. The findings of this study revealed a power law relationship for the ranking and frequency of (a) the host names of URLs, and (b) a pair of hashtags and URLs that appeared in the tweet messages. This study also discovered that the most popular URLs used in tweets come from news and media websites, and a majority of the hyperlinked resources are news web pages. One implication of this study is that Twitter users are becoming more active in sharing already published information than in producing new information. Finally, our investigation of hashtags for web resource indexing reveals that hashtags have the potential to be used as indexing terms for co-occurring URLs in the same tweet. We also discuss the implications of this study for web resource recommendation.
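A rank-frequency relationship of the kind reported for URL host names can be checked roughly with a log-log least-squares fit. The sketch below assumes a plain list of URL strings as input; the function names are hypothetical, and a straight-line fit on log-log axes is only a quick diagnostic, not a rigorous test for a power law and not necessarily the procedure used in the study.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def host_counts(urls):
    """Count how often each host name occurs in a list of URLs."""
    return Counter(urlparse(u).netloc.lower() for u in urls)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Expects at least two positive frequencies. A slope near -1
    corresponds to a classic Zipf-like distribution.
    """
    ranked = sorted(freqs, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up URLs for illustration.
urls = ["http://Example.com/a", "https://example.com/b", "http://news.example.org/x"]
counts = host_counts(urls)
print(counts.most_common(), loglog_slope(counts.values()))
```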
  5. Yi, K.: Challenges in automated classification using library classification schemes (2006) 0.00
    Major library classification schemes have long been the standard classification framework for information sources in the traditional library environment, and text classification (TC) has become a popular and attractive tool for organizing digital information. This paper gives an overview of previous projects and studies on TC using major library classification schemes, and summarizes a discussion of TC research challenges.
  6. Yi, K.; Chan, L.M.: Revisiting the syntactical and structural analysis of Library of Congress Subject Headings for the digital environment (2010) 0.00
    With the current information environment characterized by the proliferation of digital resources, including collaboratively created and shared resources, Library of Congress Subject Headings (LCSH) is facing the challenges of effective and efficient subject-based organization and retrieval of digital resources. To explore the feasibility of utilizing LCSH in a digital environment, we might need to revisit its basic characteristics. The objectives of our study were to analyze LCSH in both syntactic and relational structures, to discover the structural characteristics of LCSH, and to identify problems and issues for the feasibility of LCSH as an effective subject access tool. This study reports and discusses issues raised by the syntactic and hierarchical structures of LCSH that present challenges to its use in a networked environment. Given the results of this study, we recommend a number of provisional future directions for developing LCSH into a more viable system for digital and networked resources.
  7. Yi, K.; Beheshti, J.; Cole, C.; Leide, J.E.; Large, A.: User search behavior of domain-specific information retrieval systems : an analysis of the query logs from PsycINFO and ABC-Clio's Historical Abstracts/America: History and Life (2006) 0.00
    The authors report the findings of a study that analyzes and compares the query logs of PsycINFO for psychology and the two history databases of ABC-Clio: Historical Abstracts and America: History and Life to establish the sociological nature of information need, searching, and seeking in history versus psychology. Two problems are addressed: (a) What level of query log analysis - by individual query terms, by co-occurrence of word pairs, or by multiword terms (MWTs) - best serves as data for categorizing the queries to these two subject-bound databases; and (b) how can the differences in the nature of the queries to history versus psychology databases aid in our understanding of user search behavior and the information needs of their respective users. The authors conclude that MWTs provide the most effective snapshot of user searching behavior for query categorization. The MWTs to ABC-Clio indicate specific instances of historical events, people, and regions, whereas the MWTs to PsycINFO indicate concepts roughly equivalent to descriptors used by PsycINFO's own classification scheme. The average length of queries is 3.16 terms for PsycINFO and 3.42 for ABC-Clio, which breaks from findings for other reference and scholarly search engine studies, bringing query length closer in line to findings for general Web search engines like Excite.
  8. Yi, K.; Chan, L.M.: Linking folksonomy to Library of Congress subject headings : an exploratory study (2009) 0.00
    Purpose - The purpose of this paper is to investigate the linking of a folksonomy (user vocabulary) and LCSH (controlled vocabulary) on the basis of word matching, for the potential use of LCSH in bringing order to folksonomies. Design/methodology/approach - A selected sample of a folksonomy from a popular collaborative tagging system, Delicious, was word-matched with LCSH. LCSH was transformed into a tree structure called an LCSH tree for the matching. A close examination was conducted on the characteristics of folksonomies, the overlap of folksonomies with LCSH, and the distribution of folksonomies over the LCSH tree. Findings - The experimental results showed that the total proportion of tags being matched with LC subject headings constituted approximately two-thirds of all tags involved, with an additional 10 percent of the remaining tags having potential matches. A number of barriers for the linking as well as two areas in need of improving the matching are identified and described. Three important tag distribution patterns over the LCSH tree were identified and supported: skewedness, multifacet, and Zipfian-pattern. Research limitations/implications - The results of the study can be adopted for the development of innovative methods of mapping between folksonomy and LCSH, which directly contributes to effective access and retrieval of tagged web resources and to the integration of multiple information repositories based on the two vocabularies. Practical implications - The linking of controlled vocabularies can be applicable to enhance information retrieval capability within collaborative tagging systems as well as across various tagging system information depositories and bibliographic databases. Originality/value - This is among the frontier works that examine the potential of linking a folksonomy, extracted from a collaborative tagging system, to an authority-maintained subject heading system. It provides exploratory data to support further advanced mapping methods for linking the two vocabularies.
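Word matching between tags and subject headings, as described in the abstract above, can be sketched in a few lines. The normalization rule here (lowercasing and keeping only alphanumeric runs) and the function names are assumptions made for illustration, the sample strings are not drawn from the study's data, and the paper's actual matching procedure may differ.

```python
import re


def normalize(term):
    """Lowercase a term and reduce it to space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", term.lower()))


def match_tags(tags, headings):
    """Return (tag -> matched heading, proportion of tags matched).

    A tag matches a heading when their normalized forms are identical.
    """
    index = {normalize(h): h for h in headings}
    matched = {t: index[normalize(t)] for t in tags if normalize(t) in index}
    rate = len(matched) / len(tags) if tags else 0.0
    return matched, rate


# Illustrative strings only.
tags = ["Web-Design", "python", "todo"]
headings = ["Web design", "Python (Computer program language)", "Cooking"]
print(match_tags(tags, headings))
```

Exact matching misses "python" here because the heading carries a parenthetical qualifier, which shows how a strict word match can leave a plausible pairing unmatched.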
  9. Yi, K.: ¬A semantic similarity approach to predicting Library of Congress subject headings for social tags (2010) 0.00
    Social tagging or collaborative tagging has become a new trend in the organization, management, and discovery of digital information. The rapid growth of shared information mostly controlled by social tags poses a new challenge for social tag-based information organization and retrieval. A plausible approach for this challenge is linking social tags to a controlled vocabulary. As an introductory step for this approach, this study investigates ways of predicting relevant subject headings for resources from social tags assigned to the resources. The prediction of subject headings was measured by five different similarity measures: tf-idf, cosine-based similarity (CoS), Jaccard similarity (or Jaccard coefficient; JS), mutual information (MI), and information radius (IRad). Their results were compared to those by professionals. The results show that a CoS measure based on the top five social tags was most effective. Including more social tags only degraded the performance. The performance of JS is comparable to the performance of CoS, while tf-idf trails the best performance by up to 70%. MI and IRad have inferior performance compared to the other methods. This study demonstrates the application of the similarity measuring techniques to the prediction of correct Library of Congress subject headings.
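Two of the measures named above, cosine-based similarity and Jaccard similarity, can be illustrated with a small sketch that ranks candidate headings against a resource's top tags. Treating each heading as a bag of lowercased words, the `top_n` cut-off parameter, and all function names are assumptions for illustration, and the sample data is made up. The study's actual feature representation is not given in the abstract.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists, using term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard coefficient between the token sets of two lists."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def rank_headings(tags, headings, measure=cosine, top_n=5):
    """Score each heading against the first top_n tags, best first."""
    top_tags = tags[:top_n]
    scored = [(measure(top_tags, h.lower().split()), h) for h in headings]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Made-up tags (assumed already ordered by frequency) and candidate headings.
tags = ["python", "programming", "web"]
headings = ["Python programming", "Web servers", "Cooking"]
for score, heading in rank_headings(tags, headings):
    print(f"{score:.3f}  {heading}")
```

Passing `measure=jaccard` swaps the scoring function without changing the ranking code, which makes it easy to compare the two measures on the same tags.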