Search (11 results, page 1 of 1)

  • Filter: author_ss:"Golub, K."
  1. Golub, K.: Automated subject classification of textual Web pages, based on a controlled vocabulary : challenges and recommendations (2006) 0.02
    
    Abstract
    The primary objective of this study was to identify and address problems of applying a controlled vocabulary in automated subject classification of textual Web pages, in the area of engineering. Web pages have special characteristics such as structural information, but are at the same time rather heterogeneous. The classification approach used comprises string-to-string matching between words in a term list extracted from the Ei (Engineering Information) thesaurus and classification scheme, and words in the text to be classified. Based on a sample of 70 Web pages, a number of problems with the term list are identified. Reasons for those problems are discussed and improvements proposed. Methods for implementing the improvements are also specified, suggesting further research.
    Content
    Contribution to a special issue on "Knowledge organization systems and services"
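    A minimal sketch of the string-to-string matching approach described in the abstract above, in Python. The term list and class codes are invented for illustration; they are not the actual Ei (Engineering Information) data used in the study.

      # Illustrative string-matching classification against a controlled vocabulary.
      # The term list and class codes below are hypothetical, not the Ei data.
      import re
      from collections import defaultdict

      term_list = {
          "heat transfer": "641.2",          # vocabulary term -> class code (invented)
          "fluid mechanics": "631",
          "finite element method": "921.6",
      }

      def classify(text: str) -> list[tuple[str, int]]:
          """Rank candidate classes by the number of matched vocabulary terms."""
          text = text.lower()
          scores: dict[str, int] = defaultdict(int)
          for term, cls in term_list.items():
              # whole-phrase matching; a real system would also handle stemming and synonyms
              hits = len(re.findall(r"\b" + re.escape(term) + r"\b", text))
              if hits:
                  scores[cls] += hits
          return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

      print(classify("This page introduces heat transfer and basic fluid mechanics."))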
  2. Koch, T.; Golub, K.; Ardö, A.: Users browsing behaviour in a DDC-based Web service : a log analysis (2006) 0.02
    
    Abstract
    This study explores the navigation behaviour of all users of a large web service, Renardus, using web log analysis. Renardus provides integrated searching and browsing access to quality-controlled web resources from major individual subject gateway services. The main navigation feature is subject browsing through the Dewey Decimal Classification (DDC) based on mapping of classes of resources from the distributed gateways to the DDC structure. Among the more surprising results are the hugely dominant share of browsing activities, the good use of browsing support features like the graphical fish-eye overviews, rather long and varied navigation sequences, as well as extensive hierarchical directory-style browsing through the large DDC system.
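    A rough sketch of the kind of log analysis described above: classifying logged requests as browsing or searching and computing the share of browsing activity. The log format and URL patterns are invented and do not reproduce Renardus's actual logs.

      # Hypothetical log-analysis sketch: label each request as browsing or
      # searching, then compute the share of each. URL patterns are invented.
      from collections import Counter

      log_lines = [
          "GET /browse/ddc/620 HTTP/1.1",
          "GET /search?q=hydraulics HTTP/1.1",
          "GET /browse/ddc/621.2 HTTP/1.1",
      ]

      def action(line: str) -> str:
          if "/browse/" in line:
              return "browse"
          if "/search" in line:
              return "search"
          return "other"

      counts = Counter(action(line) for line in log_lines)
      total = sum(counts.values())
      print({a: round(n / total, 2) for a, n in counts.items()})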
  3. Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.01
    
    Abstract
    Terminology registries (TRs) are a crucial element of the infrastructure required for resource discovery services, digital libraries, Linked Data, and semantic interoperability generally. They can make the content of knowledge organization systems (KOS) available both for human and machine access. The paper describes the attributes and functionality for a TR, based on a review of published literature, existing TRs, and a survey of experts. A domain model based on user tasks is constructed and a set of core metadata elements for use in TRs is proposed. Ideally, the TR should allow searching as well as browsing for a KOS, matching a user's search while also providing information about existing terminology services, accessible to both humans and machines. The issues surrounding metadata for KOS are also discussed, together with the rationale for different aspects and the importance of a core set of KOS metadata for future machine-based access; a possible core set of metadata elements is proposed. This is dealt with in terms of practical experience and in relation to the Dublin Core Application Profile.
    Date
    22. 8.2014 17:12:54
    Theme
    Semantische Interoperabilität
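    As a hedged illustration of the kind of core KOS metadata a terminology registry might hold: the element names below are assumptions in a Dublin-Core-like spirit, not the core set actually proposed in the paper above.

      # Illustrative-only terminology-registry record with a Dublin-Core-flavoured
      # element set; the field names are assumptions, not the paper's proposal.
      from dataclasses import dataclass, field

      @dataclass
      class KOSRecord:
          identifier: str                 # e.g. a URI for the vocabulary
          title: str
          creator: str
          kos_type: str                   # e.g. "thesaurus", "classification scheme"
          subject_domains: list[str] = field(default_factory=list)
          languages: list[str] = field(default_factory=list)
          access_url: str = ""            # human access
          service_endpoint: str = ""      # machine access, e.g. a REST or SPARQL endpoint

      record = KOSRecord(
          identifier="http://example.org/kos/agrovoc",
          title="AGROVOC",
          creator="FAO",
          kos_type="thesaurus",
          subject_domains=["agriculture"],
          languages=["en", "fr"],
      )
      print(record.title, record.kos_type)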
  4. Golub, K.: Subject access in Swedish discovery services (2018) 0.01
    
    Abstract
    While support for subject searching has traditionally been advocated in library catalogs, often in the form of a catalog objective to find everything that a library has on a certain topic, research has shown that subject access has not been satisfactory. Many existing online catalogs and discovery services do not seem to make good use of the intellectual effort invested in assigning controlled subject index terms and classes. For example, few support hierarchical browsing of classification schemes and other controlled vocabularies with hierarchical structures, and few provide end-user-friendly options to choose a more specific concept to increase precision, a broader or related concept to increase recall, to disambiguate homonyms, or to find which term best names a concept. Optimum subject access in library catalogs and discovery services is analyzed from the perspective of earlier research as well as contemporary conceptual models and cataloguing codes, and eighteen features of what such access should entail in practice are proposed. In an exploratory qualitative study, the three most common discovery services used in Swedish academic libraries are analyzed against these features. In line with previous research, subject access in contemporary interfaces is shown to be less than optimal, in spite of the fact that individual collections have been indexed with controlled vocabularies and a significant number of controlled vocabularies have been mapped to each other and are available in interoperable standards. Strategic action is proposed to build research-informed (inter)national standards and guidelines.
    Theme
    Semantische Interoperabilität
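    A minimal sketch of the vocabulary-assisted searching the paper above argues for: suggesting more specific concepts to increase precision, and broader or related concepts to increase recall. The mini-vocabulary is invented, and no particular discovery service's API is assumed.

      # Sketch of vocabulary-assisted query refinement; the vocabulary is invented.
      vocab = {
          "vehicles": {"narrower": ["cars", "bicycles"], "broader": [], "related": ["transport"]},
          "cars": {"narrower": ["electric cars"], "broader": ["vehicles"], "related": ["roads"]},
      }

      def suggest(term: str, goal: str) -> list[str]:
          """goal='precision': suggest more specific concepts to search instead;
          goal='recall': suggest broader and related concepts to add to the query."""
          entry = vocab.get(term, {})
          if goal == "precision":
              return entry.get("narrower", [])
          return entry.get("broader", []) + entry.get("related", [])

      print(suggest("cars", "precision"))  # ['electric cars']
      print(suggest("cars", "recall"))     # ['vehicles', 'roads']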
  5. Golub, K.: Automated subject classification of textual web documents (2006) 0.01
    
    Abstract
    Purpose - To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and to point to problems with these approaches and with automated classification as such. Design/methodology/approach - A range of works dealing with automated classification of full-text web documents is discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application, and characteristics of web pages. Findings - Identifies the major similarities and differences between the three approaches: document pre-processing and the utilization of web-specific document characteristics are common to all of them, while the major differences lie in the algorithms applied and in whether or not the vector space model and controlled vocabularies are employed. Problems of automated classification are also recognized. Research limitations/implications - The paper does not attempt to provide an exhaustive bibliography of related resources. Practical implications - As an integrated overview of approaches from different research communities with application examples, it is useful for students in library and information science and in computer science, as well as for practitioners; researchers from one community can see how similar tasks are conducted in other communities. Originality/value - To the author's knowledge, no previous review of automated text classification has discussed more than one community's approach from an integrated perspective.
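    One of the approaches contrasted in this review, the vector space model, reduced to a toy sketch: TF-IDF weighting plus cosine similarity to the nearest labelled document. The training texts and labels are invented; real systems use proper tokenisation and much larger corpora.

      # Toy vector-space-model classifier: TF-IDF vectors, cosine similarity,
      # nearest labelled document wins. All data below are invented.
      import math
      from collections import Counter

      train = [
          ("bridge design and structural engineering", "engineering"),
          ("poetry and literary criticism", "humanities"),
      ]
      docs_tokens = [text.split() for text, _ in train]
      labels = [label for _, label in train]

      def tfidf(tokens, corpus):
          n = len(corpus)
          vec = {}
          for term, tf in Counter(tokens).items():
              df = sum(1 for doc in corpus if term in doc)
              vec[term] = tf * math.log((n + 1) / (df + 1))  # smoothed idf
          return vec

      def cosine(a, b):
          dot = sum(w * b.get(t, 0.0) for t, w in a.items())
          na = math.sqrt(sum(w * w for w in a.values()))
          nb = math.sqrt(sum(w * w for w in b.values()))
          return dot / (na * nb) if na and nb else 0.0

      vectors = [(tfidf(toks, docs_tokens), lab) for toks, lab in zip(docs_tokens, labels)]
      query = tfidf("structural analysis of a bridge".split(), docs_tokens)
      print(max(vectors, key=lambda vl: cosine(query, vl[0]))[1])  # -> engineering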
  6. Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.01
    
    Abstract
    While automated methods for information organization have been around for several decades now, the exponential growth of the World Wide Web has put them at the forefront of research in different communities, within which several approaches can be identified: 1) machine learning (algorithms that allow computers to improve their performance based on learning from pre-existing data); 2) document clustering (algorithms for unsupervised document organization and automated topic extraction); and 3) string matching (algorithms that match given strings within larger text). Here the aim was to automatically organize textual documents into hierarchical structures for subject browsing. The string-matching approach was tested using a controlled vocabulary (containing pre-selected and pre-defined authorized terms, each corresponding to only one concept). The results imply that an appropriate controlled vocabulary, with a sufficient number of entry terms designating classes, could in itself be a solution for automated classification. Further, if the same controlled vocabulary had an appropriate hierarchical structure, it would at the same time provide a good browsing structure for the collection of automatically classified documents.
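    A small sketch of the second half of the argument above: once documents carry hierarchical class codes, a browsing structure can be derived directly from the notation. The codes and document identifiers below are invented.

      # Sketch: arrange automatically classified documents into a browse hierarchy,
      # keyed by dotted class notation. Codes and document ids are invented.
      from collections import defaultdict

      assignments = [
          ("doc1", "620.1"),     # document id, assigned class code
          ("doc2", "620.1.2"),
          ("doc3", "621"),
      ]

      tree = defaultdict(list)           # parent class -> child classes
      docs_by_class = defaultdict(list)  # class -> documents

      for doc, code in assignments:
          docs_by_class[code].append(doc)
          parts = code.split(".")
          for i in range(1, len(parts)):
              parent, child = ".".join(parts[:i]), ".".join(parts[:i + 1])
              if child not in tree[parent]:
                  tree[parent].append(child)

      def show(node: str, depth: int = 0) -> None:
          print("  " * depth + node, docs_by_class.get(node, []))
          for child in tree.get(node, []):
              show(child, depth + 1)

      show("620")
      show("621")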
  7. Golub, K.; Lykke, M.: Automated classification of web pages in hierarchical browsing (2009) 0.00
    
    Abstract
    Purpose - The purpose of this study is twofold: to investigate whether it is meaningful to use the Engineering Index (Ei) classification scheme for browsing, and then, if proven useful, to investigate the performance of an automated classification algorithm based on the Ei classification scheme. Design/methodology/approach - A user study was conducted in which users solved four controlled searching tasks. The users browsed the Ei classification scheme in order to examine the suitability of the classification system for browsing. The classification algorithm was evaluated by the users, who judged the correctness of the automatically assigned classes. Findings - The study showed that the Ei classification scheme is suited for browsing. Automatically assigned classes were on average partly correct, with some classes working better than others. Success in browsing was shown to be correlated with, and dependent on, classification correctness. Research limitations/implications - Further research should address problems of disparate evaluations of one and the same web page. Additional reasons behind browsing failures in the Ei classification scheme also need further investigation. Practical implications - Improvements for browsing were identified: describing class captions and/or listing their subclasses from the start; allowing searching for words from class captions with synonym search (easily provided for Ei since the classes are mapped to thesauri terms); and, when searching for class captions, returning the hierarchical tree expanded around the class in whose caption the search term is found. The need for improvements of classification schemes was also indicated. Originality/value - A user-based evaluation of automated subject classification in the context of browsing has not been conducted before; hence the study also presents new findings concerning methodology.
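    A hedged sketch of one of the suggested improvements above: searching class captions and returning the matched class within its expanded hierarchical context (ancestors plus immediate subclasses). The captions and codes are invented, not actual Ei classes.

      # Sketch: caption search that returns the matched class with its ancestor
      # path and immediate subclasses. Captions and codes are invented.
      classes = {
          "621":     {"caption": "Mechanical engineering", "parent": None},
          "621.2":   {"caption": "Hydraulic machinery",    "parent": "621"},
          "621.2.1": {"caption": "Pumps",                  "parent": "621.2"},
      }

      def expand_around(query: str):
          hits = [c for c, info in classes.items() if query.lower() in info["caption"].lower()]
          for code in hits:
              chain, cur = [], code
              while cur is not None:            # collect ancestors up to the root
                  chain.append(cur)
                  cur = classes[cur]["parent"]
              children = [c for c, info in classes.items() if info["parent"] == code]
              yield code, list(reversed(chain)), children

      for code, path, children in expand_around("hydraulic"):
          print(code, path, children)
      # -> 621.2 ['621', '621.2'] ['621.2.1']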
  8. Johansson, S.; Golub, K.: LibraryThing for libraries : how tag moderation and size limitations affect tag clouds (2019) 0.00
    
    Abstract
    The aim of this study is to analyse differences between tags on LibraryThing's web page and tag clouds in their "LibraryThing for Libraries" service, and assess if, and how, the LibraryThing tag moderation and limitations to the size of the tag cloud in the library catalogue affect the description of the information resource. An e-mail survey was conducted with personnel at LibraryThing, and the results were compared against tags for twenty different fiction books, collected from two different library catalogues with disparate tag cloud sizes, and LibraryThing's web page. The data were analysed using a modified version of Golder and Huberman's tag categories (2006). The results show that while LibraryThing claims to only remove the inherently personal tags, several other types of tags are found to have been discarded as well. Occasionally a certain type of tag is included in one book, and excluded in another. The comparison between the two tag cloud sizes suggests that the larger tag clouds provide a more pronounced picture regarding the contents of the book, but at the cost of an increase in the number of tags with synonymous or redundant information.
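    A minimal sketch of the size limitation discussed above: a tag cloud capped at the N most frequent tags drops the rarer, more specific tags that a larger cloud retains. The tag counts are invented.

      # Sketch of a size-limited tag cloud: keep only the N most frequent tags.
      # The tag counts below are invented.
      from collections import Counter

      tag_counts = Counter({
          "fantasy": 120, "magic": 80, "young adult": 55,
          "dragons": 30, "coming of age": 12, "own copy": 3,
      })

      def tag_cloud(counts: Counter, limit: int) -> list[str]:
          return [tag for tag, _ in counts.most_common(limit)]

      print(tag_cloud(tag_counts, 3))  # small cloud: ['fantasy', 'magic', 'young adult']
      print(tag_cloud(tag_counts, 6))  # larger cloud also surfaces rarer, more specific tags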
  9. Golub, K.; Tyrkkö, J.; Hansson, J.; Ahlström, I.: Subject indexing in humanities : a comparison between a local university repository and an international bibliographic service (2020) 0.00
    
    Abstract
    As the humanities develop in the realm of increasingly pronounced digital scholarship, it is important to provide quality subject access to a vast range of heterogeneous information objects in digital services. The study aims to paint a representative picture of the current state of the use of subject index terms in humanities journal articles, with particular reference to the well-established subject access needs of humanities researchers, and with the purpose of identifying which improvements are needed in this context. Design/methodology/approach The comparison of subject metadata on a sample of 649 peer-reviewed journal articles from across the humanities is conducted between a university repository and Scopus, the former reflecting local and national policies and the latter being the most comprehensive international abstract and citation database of research output. Findings The study shows that established bibliographic objectives to ensure subject access for humanities journal articles are not supported in either the world's largest commercial abstract and citation database, Scopus, or the local repository of a public university in Sweden. The indexing policies in the two services do not seem to address the needs of humanities scholars for highly granular subject index terms with appropriate facets; no controlled vocabularies for any humanities discipline are used whatsoever. Originality/value In all, not much has changed since the 1990s, when indexing for the humanities was shown to lag behind the sciences. The community of researchers and information professionals, today working together on digital humanities projects, as well as interdisciplinary research teams, should demand that their subject access needs be fulfilled, especially in commercial services like Scopus and discovery services.
  10. Wartena, C.; Golub, K.: Evaluierung von Verschlagwortung im Kontext des Information Retrievals (2021) 0.00
    
    Abstract
    This contribution aims to give an overview of the possibilities, challenges and limits discussed in the literature for using retrieval as an extrinsic evaluation method for the results of verbal subject indexing. Subject indexing in general, and the assignment of subject headings (Verschlagwortung) in particular, can be evaluated intrinsically or extrinsically. Intrinsic evaluation refers to properties of the indexing that are assumed to be suitable indicators of its quality, such as formal uniformity (with respect to the number of descriptors assigned per document, granularity, etc.), consistency, or agreement between the results of different indexers. Extrinsic evaluation, in contrast, measures the quality of the chosen descriptors by how well they actually perform in searching. Although extrinsic evaluation gives more direct information on whether the indexing fulfils its purpose, and should therefore be preferred, it is complicated and often problematic. In a retrieval system, various algorithms and data sources interlock in complex ways and, during evaluation, additionally interact with users and search tasks. A single component of the system cannot simply be evaluated by swapping it out and comparing it with another component, since the same resource or the same algorithm may behave differently in different environments. We present and discuss relevant evaluation approaches and conclude with some recommendations for the evaluation of subject indexing in the context of retrieval.
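    A small sketch of one of the intrinsic measures mentioned above, agreement between two indexers on the same document, computed here as a Jaccard-style overlap of their descriptor sets. The descriptor sets are invented.

      # Sketch of inter-indexer consistency as the overlap of two descriptor sets
      # (Jaccard-style). The example descriptor sets are invented.
      def indexer_consistency(terms_a: set[str], terms_b: set[str]) -> float:
          common = terms_a & terms_b
          union = terms_a | terms_b
          return len(common) / len(union) if union else 1.0

      a = {"information retrieval", "evaluation", "indexing"}
      b = {"information retrieval", "subject indexing", "evaluation"}
      print(round(indexer_consistency(a, b), 2))  # 0.5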
  11. Golub, K.; Ziolkowski, P.M.; Zlodi, G.: Organizing subject access to cultural heritage in Swedish online museums (2022) 0.00
    
    Abstract
    Purpose The study aims to paint a representative picture of the current state of search interfaces of Swedish online museum collections, focussing on search functionalities with particular reference to subject searching, as well as the use of controlled vocabularies, with the purpose of identifying which improvements of the search interfaces are needed to ensure high-quality information retrieval for the end user. Design/methodology/approach In the first step, a set of 21 search interface criteria was identified, based on related research and current standards in the domain of cultural heritage knowledge organization. Secondly, a complete set of Swedish museums that provide online access to their collections was identified, comprising nine cross-search services and 91 individual museums' websites. These 100 websites were each evaluated against the 21 criteria, between 1 July and 31 August 2020. Findings Although many standards and guidelines are in place to ensure quality-controlled subject indexing, which in turn support information retrieval of relevant resources (as individual or full search results), the study shows that they are not broadly implemented, resulting in information retrieval failures for the end user. The study also demonstrates a strong need for the implementation of controlled vocabularies in these museums. Originality/value This study is a rare piece of research which examines subject searching in online museums; the 21 search criteria and their use in the analysis of the complete set of online collections of a country represents a considerable and unique contribution to the fields of knowledge organization and information retrieval of cultural heritage. Its particular value lies in showing how the needs of end users, many of which are documented and reflected in international standards and guidelines, should be taken into account in designing search tools for these museums; especially so in subject searching, which is the most complex and yet the most common type of search. Much effort has been invested into digitizing cultural heritage collections, but access to them is hindered by poor search functionality. This study identifies which are the most important aspects to improve.