Search (7 results, page 1 of 1)

  Filter: author_ss:"Golub, K."
  1. Golub, K.; Lykke, M.: Automated classification of web pages in hierarchical browsing (2009) 0.01
    0.012456408 = product of:
      0.03736922 = sum of:
        0.03736922 = product of:
          0.07473844 = sum of:
            0.07473844 = weight(_text_:methodology in 3614) [ClassicSimilarity], result of:
              0.07473844 = score(doc=3614,freq=4.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.35193008 = fieldWeight in 3614, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3614)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this study is twofold: to investigate whether it is meaningful to use the Engineering Index (Ei) classification scheme for browsing, and then, if proven useful, to investigate the performance of an automated classification algorithm based on the Ei classification scheme. Design/methodology/approach - A user study was conducted in which users solved four controlled searching tasks. The users browsed the Ei classification scheme in order to examine the suitability of the classification systems for browsing. The classification algorithm was evaluated by the users who judged the correctness of the automatically assigned classes. Findings - The study showed that the Ei classification scheme is suited for browsing. Automatically assigned classes were on average partly correct, with some classes working better than others. Browsing success proved to be correlated with, and dependent on, classification correctness. Research limitations/implications - Further research should address problems of disparate evaluations of one and the same web page. Additional reasons behind browsing failures in the Ei classification scheme also need further investigation. Practical implications - Improvements for browsing were identified: describing class captions and/or listing their subclasses from the start; allowing for searching for words from class captions with synonym search (easily provided for Ei since the classes are mapped to thesauri terms); when searching for class captions, returning the hierarchical tree expanded around the class whose caption contains the search term. The need for improvements of classification schemes was also indicated. Originality/value - A user-based evaluation of automated subject classification in the context of browsing has not been conducted before; hence the study also presents new findings concerning methodology.
  2. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.01
    0.00880801 = product of:
      0.02642403 = sum of:
        0.02642403 = product of:
          0.05284806 = sum of:
            0.05284806 = weight(_text_:methodology in 2918) [ClassicSimilarity], result of:
              0.05284806 = score(doc=2918,freq=2.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.24885213 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was the DDC, which also comprised mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
  3. Golub, K.: Automated subject classification of textual web documents (2006) 0.01
    0.00880801 = product of:
      0.02642403 = sum of:
        0.02642403 = product of:
          0.05284806 = sum of:
            0.05284806 = weight(_text_:methodology in 5600) [ClassicSimilarity], result of:
              0.05284806 = score(doc=5600,freq=2.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.24885213 = fieldWeight in 5600, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5600)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and point to problems with the approaches and automated classification as such. Design/methodology/approach - A range of works dealing with automated classification of full-text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages. Findings - Provides major similarities and differences between the three approaches: document pre-processing and utilization of web-specific document characteristics are common to all the approaches; major differences are in applied algorithms, employment or not of the vector space model and of controlled vocabularies. Problems of automated classification are recognized. Research limitations/implications - The paper does not attempt to provide an exhaustive bibliography of related resources. Practical implications - As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community have the information on how similar tasks are conducted in different communities. Originality/value - To the author's knowledge, no review paper on automated text classification has attempted to discuss more than one community's approach from an integrated perspective.
  4. Matthews, B.; Jones, C.; Puzon, B.; Moon, J.; Tudhope, D.; Golub, K.; Nielsen, M.L.: An evaluation of enhancing social tagging with a knowledge organization system (2010) 0.01
    0.00880801 = product of:
      0.02642403 = sum of:
        0.02642403 = product of:
          0.05284806 = sum of:
            0.05284806 = weight(_text_:methodology in 4171) [ClassicSimilarity], result of:
              0.05284806 = score(doc=4171,freq=2.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.24885213 = fieldWeight in 4171, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4171)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - Traditional subject indexing and classification are considered infeasible in many digital collections. This paper seeks to investigate ways of enhancing social tagging via knowledge organization systems, with a view to improving the quality of tags for increased information discovery and retrieval performance. Design/methodology/approach - Enhanced tagging interfaces were developed for exemplar online repositories, and trials were undertaken with author and reader groups to evaluate the effectiveness of tagging augmented with controlled vocabulary for subject indexing of papers in online repositories. Findings - The results showed that using a knowledge organisation system to augment tagging does appear to increase the effectiveness of non-specialist users (that is, without information science training) in subject indexing. Research limitations/implications - While limited by the size and scope of the trials undertaken, these results do point to the usefulness of a mixed approach in supporting the subject indexing of online resources. Originality/value - The value of this work is as a guide to future developments in the practical support for resource indexing in online repositories.
  5. Golub, K.; Tyrkkö, J.; Hansson, J.; Ahlström, I.: Subject indexing in humanities : a comparison between a local university repository and an international bibliographic service (2020) 0.01
    0.00880801 = product of:
      0.02642403 = sum of:
        0.02642403 = product of:
          0.05284806 = sum of:
            0.05284806 = weight(_text_:methodology in 5982) [ClassicSimilarity], result of:
              0.05284806 = score(doc=5982,freq=2.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.24885213 = fieldWeight in 5982, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5982)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    As the humanities develop in the realm of increasingly pronounced digital scholarship, it is important to provide quality subject access to a vast range of heterogeneous information objects in digital services. The study aims to paint a representative picture of the current state of affairs of the use of subject index terms in humanities journal articles with particular reference to the well-established subject access needs of humanities researchers, with the purpose of identifying which improvements are needed in this context. Design/methodology/approach - The comparison of subject metadata on a sample of 649 peer-reviewed journal articles from across the humanities is conducted in a university repository, against Scopus, the former reflecting local and national policies and the latter being the most comprehensive international abstract and citation database of research output. Findings - The study shows that established bibliographic objectives to ensure subject access for humanities journal articles are not supported in either the world's largest commercial abstract and citation database Scopus or the local repository of a public university in Sweden. The indexing policies in the two services do not seem to address the needs of humanities scholars for highly granular subject index terms with appropriate facets; no controlled vocabularies for any humanities discipline are used whatsoever. Originality/value - In all, not much has changed since the 1990s when indexing for the humanities was shown to lag behind the sciences. The community of researchers and information professionals, today working together on digital humanities projects, as well as interdisciplinary research teams, should demand that their subject access needs be fulfilled, especially in commercial services like Scopus and discovery services.
  6. Golub, K.; Ziolkowski, P.M.; Zlodi, G.: Organizing subject access to cultural heritage in Swedish online museums (2022) 0.01
    0.007046408 = product of:
      0.021139223 = sum of:
        0.021139223 = product of:
          0.042278446 = sum of:
            0.042278446 = weight(_text_:methodology in 688) [ClassicSimilarity], result of:
              0.042278446 = score(doc=688,freq=2.0), product of:
                0.21236731 = queryWeight, product of:
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.047143444 = queryNorm
                0.1990817 = fieldWeight in 688, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.504705 = idf(docFreq=1328, maxDocs=44218)
                  0.03125 = fieldNorm(doc=688)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The study aims to paint a representative picture of the current state of search interfaces of Swedish online museum collections, focussing on search functionalities with particular reference to subject searching, as well as the use of controlled vocabularies, with the purpose of identifying which improvements of the search interfaces are needed to ensure high-quality information retrieval for the end user. Design/methodology/approach - In the first step, a set of 21 search interface criteria was identified, based on related research and current standards in the domain of cultural heritage knowledge organization. Secondly, a complete set of Swedish museums that provide online access to their collections was identified, comprising nine cross-search services and 91 individual museums' websites. These 100 websites were each evaluated against the 21 criteria, between 1 July and 31 August 2020. Findings - Although many standards and guidelines are in place to ensure quality-controlled subject indexing, which in turn support information retrieval of relevant resources (as individual or full search results), the study shows that they are not broadly implemented, resulting in information retrieval failures for the end user. The study also demonstrates a strong need for the implementation of controlled vocabularies in these museums. Originality/value - This study is a rare piece of research which examines subject searching in online museums; the 21 search criteria and their use in the analysis of the complete set of online collections of a country represent a considerable and unique contribution to the fields of knowledge organization and information retrieval of cultural heritage. Its particular value lies in showing how the needs of end users, many of which are documented and reflected in international standards and guidelines, should be taken into account in designing search tools for these museums; especially so in subject searching, which is the most complex and yet the most common type of search. Much effort has been invested into digitizing cultural heritage collections, but access to them is hindered by poor search functionality. This study identifies which are the most important aspects to improve.
  7. Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.01
    0.006387286 = product of:
      0.019161858 = sum of:
        0.019161858 = product of:
          0.038323715 = sum of:
            0.038323715 = weight(_text_:22 in 1347) [ClassicSimilarity], result of:
              0.038323715 = score(doc=1347,freq=2.0), product of:
                0.16508831 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047143444 = queryNorm
                0.23214069 = fieldWeight in 1347, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1347)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 8.2014 17:12:54
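
The score breakdowns in the listing above appear to follow Lucene's ClassicSimilarity (TF-IDF) formula: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is idf * queryNorm, fieldWeight is tf * idf * fieldNorm, and the product is then scaled by the coord factors. The following is a minimal Python sketch of that arithmetic, not the search engine's own code. The function names are hypothetical, and queryNorm, fieldNorm and the coord factors are simply copied from the listing rather than recomputed. Because Lucene works in 32-bit floats, the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one single-term score tree from the listing.

    query_norm, field_norm and coords are taken as given from the
    explain output (assumption: they are not re-derived here).
    """
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                   # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm        # "queryWeight" line
    field_weight = tf * idf * field_norm   # "fieldWeight" line
    score = query_weight * field_weight    # "weight(...)" line
    for factor in coords:                  # e.g. coord(1/2), coord(1/3)
        score *= factor
    return score


# Result 1 (doc 3614): term "methodology", freq=4.0, docFreq=1328
first = classic_score(4.0, 1328, 44218, 0.047143444, 0.0390625,
                      coords=(0.5, 1 / 3))

# Result 7 (doc 1347): term "22", freq=2.0, docFreq=3622
last = classic_score(2.0, 3622, 44218, 0.047143444, 0.046875,
                     coords=(0.5, 1 / 3))

print(first, last)
```

Under these assumptions the two printed values should land close to the 0.012456408 and 0.006387286 reported for results 1 and 7. Result 1 ranks highest because its term frequency is 4 rather than 2, while result 6 ranks below results 2 to 5 only because its fieldNorm is smaller (0.03125 against 0.0390625).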