Search (5 results, page 1 of 1)

  • author_ss:"Yi, K."
  • year_i:[2000 TO 2010}
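
The year filter above mixes bracket types: in the Lucene/Solr standard query syntax a square bracket marks an inclusive bound and a curly brace an exclusive one, so [2000 TO 2010} should match 2000 through 2009. The sketch below only illustrates that reading for integer bounds; parse_range is a hypothetical helper written for this note, not part of Solr.

```python
import re

def parse_range(expr):
    """Return a predicate for an integer range such as '[2000 TO 2010}'.

    '[' and ']' are inclusive bounds, '{' and '}' are exclusive bounds.
    Hypothetical helper for illustration; handles plain integers only.
    """
    m = re.fullmatch(r"([\[{])\s*(\d+)\s+TO\s+(\d+)\s*([\]}])", expr)
    if m is None:
        raise ValueError(f"not a simple integer range: {expr!r}")
    open_b, lo, hi, close_b = m.groups()
    lo, hi = int(lo), int(hi)

    def matches(value):
        lower_ok = value >= lo if open_b == "[" else value > lo
        upper_ok = value <= hi if close_b == "]" else value < hi
        return lower_ok and upper_ok

    return matches

in_range = parse_range("[2000 TO 2010}")
matching_years = [y for y in (1999, 2000, 2006, 2009, 2010) if in_range(y)]
print(matching_years)
```

Under this reading, the publication years in the result list (2006 to 2009) all fall inside the filter, and a 2010 record would be excluded.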
  1. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.03
    0.02713491 = product of:
      0.05426982 = sum of:
        0.05426982 = sum of:
          0.010589487 = weight(_text_:a in 2560) [ClassicSimilarity], result of:
            0.010589487 = score(doc=2560,freq=10.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.19940455 = fieldWeight in 2560, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2560)
          0.043680333 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
            0.043680333 = score(doc=2560,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.2708308 = fieldWeight in 2560, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2560)
      0.5 = coord(1/2)
    
    Abstract
    The proliferation of digital resources and their integration into a traditional library setting has created a pressing need for an automated tool that organizes textual information based on library classification schemes. Automated text classification is a research field of developing tools, methods, and models to automate text classification. This article describes the current popular approach for text classification and major text classification projects and applications that are based on library classification schemes. Related issues and challenges are discussed, and a number of considerations for the challenges are examined.
    Date
    22. 9.2008 18:31:54
    Type
    a
  2. Yi, K.: Challenges in automated classification using library classification schemes (2006) 0.00
    0.00270615 = product of:
      0.0054123 = sum of:
        0.0054123 = product of:
          0.0108246 = sum of:
            0.0108246 = weight(_text_:a in 5810) [ClassicSimilarity], result of:
              0.0108246 = score(doc=5810,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.20383182 = fieldWeight in 5810, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5810)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A major library classification scheme has long been a standard classification framework for information sources in the traditional library environment, and text classification (TC) has become a popular and attractive tool for organizing digital information. This paper gives an overview of previous projects and studies on TC using major library classification schemes, and summarizes a discussion of TC research challenges.
    Language
    a
  3. Yi, K.; Chan, L.M.: A visualization software tool for Library of Congress Subject Headings (2008) 0.00
    0.002269176 = product of:
      0.004538352 = sum of:
        0.004538352 = product of:
          0.009076704 = sum of:
            0.009076704 = weight(_text_:a in 2503) [ClassicSimilarity], result of:
              0.009076704 = score(doc=2503,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1709182 = fieldWeight in 2503, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2503)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    The aim of this study is to develop a software tool, VisuaLCSH, for effective searching, browsing, and maintenance of LCSH. This tool enables visualizing subject headings and hierarchical structures implied and embedded in LCSH. A conceptual framework for converting the hierarchical structure of headings in LCSH to an explicit tree structure is proposed, described, and implemented. The highlights of VisuaLCSH are summarized below: 1) revealing multiple aspects of a heading; 2) normalizing the hierarchical relationships in LCSH; 3) showing multi-level hierarchies in LCSH sub-trees; 4) improving the navigational function of LCSH in retrieval; and 5) enabling the implementation of generic search, i.e., the 'exploding' feature, in searching LCSH.
    Type
    a
  4. Yi, K.; Chan, L.M.: Linking folksonomy to Library of Congress subject headings : an exploratory study (2009) 0.00
    0.0021393995 = product of:
      0.004278799 = sum of:
        0.004278799 = product of:
          0.008557598 = sum of:
            0.008557598 = weight(_text_:a in 3616) [ClassicSimilarity], result of:
              0.008557598 = score(doc=3616,freq=20.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.16114321 = fieldWeight in 3616, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3616)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Purpose - The purpose of this paper is to investigate the linking of a folksonomy (user vocabulary) and LCSH (controlled vocabulary) on the basis of word matching, for the potential use of LCSH in bringing order to folksonomies. Design/methodology/approach - A selected sample of a folksonomy from a popular collaborative tagging system, Delicious, was word-matched with LCSH. LCSH was transformed into a tree structure called an LCSH tree for the matching. A close examination was conducted on the characteristics of folksonomies, the overlap of folksonomies with LCSH, and the distribution of folksonomies over the LCSH tree. Findings - The experimental results showed that the total proportion of tags being matched with LC subject headings constituted approximately two-thirds of all tags involved, with an additional 10 percent of the remaining tags having potential matches. A number of barriers for the linking as well as two areas in need of improving the matching are identified and described. Three important tag distribution patterns over the LCSH tree were identified and supported: skewedness, multifacet, and Zipfian-pattern. Research limitations/implications - The results of the study can be adopted for the development of innovative methods of mapping between folksonomy and LCSH, which directly contributes to effective access and retrieval of tagged web resources and to the integration of multiple information repositories based on the two vocabularies. Practical implications - The linking of controlled vocabularies can be applicable to enhance information retrieval capability within collaborative tagging systems as well as across various tagging system information depositories and bibliographic databases. Originality/value - This is among the frontier works that examine the potential of linking a folksonomy, extracted from a collaborative tagging system, to an authority-maintained subject heading system. It provides exploratory data to support further advanced mapping methods for linking the two vocabularies.
    Type
    a
  5. Yi, K.; Beheshti, J.; Cole, C.; Leide, J.E.; Large, A.: User search behavior of domain-specific information retrieval systems : an analysis of the query logs from PsycINFO and ABC-Clio's Historical Abstracts/America: History and Life (2006) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 197) [ClassicSimilarity], result of:
              0.006765375 = score(doc=197,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 197, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=197)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The authors report the findings of a study that analyzes and compares the query logs of PsycINFO for psychology and the two history databases of ABC-Clio: Historical Abstracts and America: History and Life to establish the sociological nature of information need, searching, and seeking in history versus psychology. Two problems are addressed: (a) What level of query log analysis - by individual query terms, by co-occurrence of word pairs, or by multiword terms (MWTs) - best serves as data for categorizing the queries to these two subject-bound databases; and (b) how can the differences in the nature of the queries to history versus psychology databases aid in our understanding of user search behavior and the information needs of their respective users. The authors conclude that MWTs provide the most effective snapshot of user searching behavior for query categorization. The MWTs to ABC-Clio indicate specific instances of historical events, people, and regions, whereas the MWTs to PsycINFO indicate concepts roughly equivalent to descriptors used by PsycINFO's own classification scheme. The average length of queries is 3.16 terms for PsycINFO and 3.42 for ABC-Clio, which breaks from findings for other reference and scholarly search engine studies, bringing query length closer in line to findings for general Web search engines like Excite.
    Type
    a
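
The score explanations in the listing above can be checked by hand. The sketch below assumes the classic Lucene TF-IDF definitions that the printed trees are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The values of maxDocs, queryNorm, docFreq, freq, fieldNorm and the coord factors are copied from the listing; the function and variable names are made up for this illustration. Lucene computes in 32-bit floats, so the last printed digits may differ slightly.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the listing
QUERY_NORM = 0.046056706  # queryNorm, as printed in the listing
DF_A, DF_22 = 37942, 3622 # docFreq of the terms "a" and "22"

def idf(doc_freq, max_docs=MAX_DOCS):
    """Classic idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_weight(freq, doc_freq, field_norm):
    """queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight

# Entry 1 (doc 2560): both terms match, followed by one coord(1/2) factor.
scores = {
    2560: 0.5 * (term_weight(10.0, DF_A, 0.0546875)
                 + term_weight(2.0, DF_22, 0.0546875)),
}

# Entries 2-5: only "a" matches, and two coord(1/2) factors apply.
single_term_docs = {          # doc id -> (freq of "a", fieldNorm)
    5810: (8.0, 0.0625),
    2503: (10.0, 0.046875),
    3616: (20.0, 0.03125),
    197: (8.0, 0.0390625),
}
for doc, (freq, norm) in single_term_docs.items():
    scores[doc] = 0.5 * 0.5 * term_weight(freq, DF_A, norm)

ranking = sorted(scores, key=scores.get, reverse=True)
for doc in ranking:
    print(doc, f"{scores[doc]:.8f}")
```

Two things follow from the numbers. Entry 1 ranks roughly ten times higher because the rarer term "22" (from its date field) contributes most of its score. Doc 3616 has the highest frequency of "a" (20) but the smallest fieldNorm, so the length normalization places it below docs with fewer occurrences.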