Search (34 results, page 1 of 2)

  • × author_ss:"Tudhope, D."
  1. Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.02
    0.02174836 = product of:
      0.03262254 = sum of:
        0.014111955 = weight(_text_:to in 1347) [ClassicSimilarity], result of:
          0.014111955 = score(doc=1347,freq=4.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.17044228 = fieldWeight in 1347, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.046875 = fieldNorm(doc=1347)
        0.018510582 = product of:
          0.037021164 = sum of:
            0.037021164 = weight(_text_:22 in 1347) [ClassicSimilarity], result of:
              0.037021164 = score(doc=1347,freq=2.0), product of:
                0.15947726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045541126 = queryNorm
                0.23214069 = fieldWeight in 1347, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1347)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Terminology registries (TRs) are a crucial element of the infrastructure required for resource discovery services, digital libraries, Linked Data, and semantic interoperability generally. They can make the content of knowledge organization systems (KOS) available both for human and machine access. The paper describes the attributes and functionality for a TR, based on a review of published literature, existing TRs, and a survey of experts. A domain model based on user tasks is constructed and a set of core metadata elements for use in TRs is proposed. Ideally, the TR should allow searching as well as browsing for a KOS, matching a user's search while also providing information about existing terminology services, accessible to both humans and machines. The issues surrounding metadata for KOS are also discussed, together with the rationale for different aspects and the importance of a core set of KOS metadata for future machine-based access; a possible core set of metadata elements is proposed. This is dealt with in terms of practical experience and in relation to the Dublin Core Application Profile.
    Date
    22. 8.2014 17:12:54
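The score breakdown for result 1 can be re-derived from the figures it prints. The sketch below is only an illustration: it assumes the standard ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values shown, and the function names are hypothetical rather than part of any search engine API.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the explain output.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight for a single matching term.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree of doc 1347.
MAX_DOCS = 44218
QUERY_NORM = 0.045541126
FIELD_NORM = 0.046875

to_score = term_score(4.0, 19512, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 sub-clauses matched.
date_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# Two of the three top-level clauses matched, hence coord(2/3).
total = (to_score + date_score) * (2.0 / 3.0)

print(f"to: {to_score:.6f}  22: {date_score:.6f}  total: {total:.6f}")
```

The total should land very close to the reported 0.02174836; small differences are expected because the engine works in single precision.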
  2. Tudhope, D.: Knowledge Organization System Services : brief review of NKOS activities and possibility of KOS registries (2007) 0.01
    0.012340388 = product of:
      0.037021164 = sum of:
        0.037021164 = product of:
          0.07404233 = sum of:
            0.07404233 = weight(_text_:22 in 100) [ClassicSimilarity], result of:
              0.07404233 = score(doc=100,freq=2.0), product of:
                0.15947726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045541126 = queryNorm
                0.46428138 = fieldWeight in 100, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=100)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 9.2007 15:41:14
  3. Tudhope, D.; Hodge, G.: Terminology registries (2007) 0.01
    0.010283656 = product of:
      0.03085097 = sum of:
        0.03085097 = product of:
          0.06170194 = sum of:
            0.06170194 = weight(_text_:22 in 539) [ClassicSimilarity], result of:
              0.06170194 = score(doc=539,freq=2.0), product of:
                0.15947726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045541126 = queryNorm
                0.38690117 = fieldWeight in 539, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=539)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    26.12.2011 13:22:07
  4. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.01
    0.009193186 = product of:
      0.027579557 = sum of:
        0.027579557 = weight(_text_:to in 2918) [ClassicSimilarity], result of:
          0.027579557 = score(doc=2918,freq=22.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.33310217 = fieldWeight in 2918, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2918)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was the DDC, also comprising mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
  5. Tudhope, D.; Nielsen, M.L.: Introduction to knowledge organization systems and services (2006) 0.01
    0.0073336246 = product of:
      0.022000873 = sum of:
        0.022000873 = weight(_text_:to in 5913) [ClassicSimilarity], result of:
          0.022000873 = score(doc=5913,freq=14.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.2657236 = fieldWeight in 5913, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5913)
      0.33333334 = coord(1/3)
    
    Abstract
    In a very real sense, this special issue on Knowledge Organization Systems and Services is concerned with new applications, new contexts and new twists of old themes and problems. We are concerned with diverse attempts to apply the outcomes of much work over the years in artificial subject languages and their intellectual structures to facilitate access to digital information in various settings. This issue has its origins in NKOS workshops, held over the last two years in Bath, Vienna, Madrid and Denver, although the majority of contributions resulted from an open call for papers, disseminated in October 2005. NKOS (http://nkos.slis.kent.edu/) is an informal network whose general aim is to enable knowledge organization systems (KOS) to act as networked information services (both machine-to-machine and human facing), supporting the description and retrieval of information resources on the Internet. Since 1997, there has been an NKOS workshop each year, at either the JCDL or ECDL conference (2005 saw an NKOS workshop at both conferences and also at Dublin Core). Previous NKOS-related special issues have appeared in the online Journal of Digital Information in 2001 and 2004 (Hill and Koch 2001, Tudhope and Koch 2004).
  6. Matthews, B.; Jones, C.; Puzon, B.; Moon, J.; Tudhope, D.; Golub, K.; Nielsen, M.L.: An evaluation of enhancing social tagging with a knowledge organization system (2010) 0.01
    0.0073336246 = product of:
      0.022000873 = sum of:
        0.022000873 = weight(_text_:to in 4171) [ClassicSimilarity], result of:
          0.022000873 = score(doc=4171,freq=14.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.2657236 = fieldWeight in 4171, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4171)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - Traditional subject indexing and classification are considered infeasible in many digital collections. This paper seeks to investigate ways of enhancing social tagging via knowledge organization systems, with a view to improving the quality of tags for increased information discovery and retrieval performance. Design/methodology/approach - Enhanced tagging interfaces were developed for exemplar online repositories, and trials were undertaken with author and reader groups to evaluate the effectiveness of tagging augmented with control vocabulary for subject indexing of papers in online repositories. Findings - The results showed that using a knowledge organisation system to augment tagging does appear to increase the effectiveness of non-specialist users (that is, without information science training) in subject indexing. Research limitations/implications - While limited by the size and scope of the trials undertaken, these results do point to the usefulness of a mixed approach in supporting the subject indexing of online resources. Originality/value - The value of this work is as a guide to future developments in the practical support for resource indexing in online repositories.
  7. Binding, C.; Tudhope, D.: Improving interoperability using vocabulary linked data (2015) 0.01
    0.0073336246 = product of:
      0.022000873 = sum of:
        0.022000873 = weight(_text_:to in 2205) [ClassicSimilarity], result of:
          0.022000873 = score(doc=2205,freq=14.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.2657236 = fieldWeight in 2205, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2205)
      0.33333334 = coord(1/3)
    
    Abstract
    The concept of Linked Data has been an emerging theme within the computing and digital heritage areas in recent years. The growth and scale of Linked Data has underlined the need for greater commonality in concept referencing, to avoid local redefinition and duplication of reference resources. Achieving domain-wide agreement on common vocabularies would be an unreasonable expectation; however, datasets often already have local vocabulary resources defined, and so the prospects for large-scale interoperability can be substantially improved by creating alignment links from these local vocabularies out to common external reference resources. The ARIADNE project is undertaking large-scale integration of archaeology dataset metadata records, to create a cross-searchable research repository resource. Key to enabling this cross search will be the 'subject' metadata originating from multiple data providers, containing terms from multiple multilingual controlled vocabularies. This paper discusses various aspects of vocabulary mapping. Experience from the previous SENESCHAL project in the publication of controlled vocabularies as Linked Open Data is discussed, emphasizing the importance of unique URI identifiers for vocabulary concepts. There is a need to align legacy indexing data to the uniquely defined concepts and examples are discussed of SENESCHAL data alignment work. A case study for the ARIADNE project presents work on mapping between vocabularies, based on the Getty Art and Architecture Thesaurus as a central hub and employing an interactive vocabulary mapping tool developed for the project, which generates SKOS mapping relationships in JSON and other formats. The potential use of such vocabulary mappings to assist cross search over archaeological datasets from different countries is illustrated in a pilot experiment. The results demonstrate the enhanced opportunities for interoperability and cross searching that the approach offers.
  8. Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: Compound descriptors in context : a matching function for classifications and thesauri (2002) 0.01
    0.006789617 = product of:
      0.02036885 = sum of:
        0.02036885 = weight(_text_:to in 3179) [ClassicSimilarity], result of:
          0.02036885 = score(doc=3179,freq=12.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24601223 = fieldWeight in 3179, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3179)
      0.33333334 = coord(1/3)
    
    Abstract
    There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This paper discusses a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms, which has the potential to help with recall problems. The work reported is part of the ongoing FACET project in collaboration with the National Museum of Science and Industry and its collections database. The architecture of the prototype system and its interface are outlined. The matching problem for compound descriptors is reviewed and the FACET implementation described. Results are discussed from scenarios using the faceted Getty Art and Architecture Thesaurus. We argue that automatic traversal of thesaurus relationships can augment the user's browsing possibilities. The techniques can be applied both to unstructured multi-concept subject headings and potentially to more syntactically structured strings. The notion of a focus term is used by the matching function to model AAT modified descriptors (noun phrases). The relevance of the approach to precoordinated indexing and matching faceted strings is discussed.
  9. Tudhope, D.; Binding, C.: Mapping between linked data vocabularies in ARIADNE (2015) 0.01
    0.006789617 = product of:
      0.02036885 = sum of:
        0.02036885 = weight(_text_:to in 2250) [ClassicSimilarity], result of:
          0.02036885 = score(doc=2250,freq=12.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24601223 = fieldWeight in 2250, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2250)
      0.33333334 = coord(1/3)
    
    Abstract
    Semantic Enrichment Enabling Sustainability of Archaeological Links (SENESCHAL) was a project coordinated by the Hypermedia Research Unit at the University of South Wales. The project aims included widening access to key vocabulary resources. National cultural heritage thesauri and vocabularies are used by both national organizations and local authority Historic Environment Records and could potentially act as vocabulary hubs for the Web of Data. Following completion, a set of prominent UK archaeological thesauri and vocabularies is now freely available as Linked Open Data (LOD) via http://www.heritagedata.org - together with open source web services and user interface controls. This presentation will reflect on work done to date for the ARIADNE FP7 infrastructure project (http://www.ariadne-infrastructure.eu) mapping between archaeological vocabularies in different languages and the utility of a hub architecture. The poly-hierarchical structure of the Getty Art & Architecture Thesaurus (AAT) was extracted for use as an example mediating structure to interconnect various multilingual vocabularies originating from ARIADNE data providers. Vocabulary resources were first converted to a common concept-based format (SKOS) and the concepts were then manually mapped to nodes of the extracted AAT structure using some judgement on the meaning of terms and scope notes. Results are presented along with reflections on the wider application to existing European archaeological vocabularies and associated online datasets.
  10. Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.01
    0.006789617 = product of:
      0.02036885 = sum of:
        0.02036885 = weight(_text_:to in 2300) [ClassicSimilarity], result of:
          0.02036885 = score(doc=2300,freq=12.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24601223 = fieldWeight in 2300, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.33333334 = coord(1/3)
    
    Abstract
    Subject terms play a crucial role in resource discovery but require substantial effort to produce. Automatic subject classification and indexing address problems of scale and sustainability and can be used to enrich existing bibliographic records, establish more connections across and between resources and enhance consistency of bibliographic data. The paper aims to put forward a complex methodological framework to evaluate automatic classification tools of Swedish textual documents based on the Dewey Decimal Classification (DDC) recently introduced to Swedish libraries. Three major complementary approaches are suggested: a quality-built gold standard, retrieval effects, domain analysis. The gold standard is built based on input from at least two catalogue librarians, end-users expert in the subject, end users inexperienced in the subject and automated tools. Retrieval effects are studied through a combination of assigned and free tasks, including factual and comprehensive types. The study also takes into consideration the different role and character of subject terms in various knowledge domains, such as scientific disciplines. As a theoretical framework, domain analysis is used and applied in relation to the implementation of DDC in Swedish libraries and chosen domains of knowledge within the DDC itself.
  11. Binding, C.; Tudhope, D.: KOS at your service : Programmatic access to knowledge organisation systems (2004) 0.01
    0.006652439 = product of:
      0.019957317 = sum of:
        0.019957317 = weight(_text_:to in 1342) [ClassicSimilarity], result of:
          0.019957317 = score(doc=1342,freq=2.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24104178 = fieldWeight in 1342, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.09375 = fieldNorm(doc=1342)
      0.33333334 = coord(1/3)
    
  12. Tudhope, D.: New Applications of Knowledge Organization Systems : introduction to a special issue (2004) 0.01
    0.006652439 = product of:
      0.019957317 = sum of:
        0.019957317 = weight(_text_:to in 2344) [ClassicSimilarity], result of:
          0.019957317 = score(doc=2344,freq=2.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24104178 = fieldWeight in 2344, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.09375 = fieldNorm(doc=2344)
      0.33333334 = coord(1/3)
    
  13. Binding, C.; Tudhope, D.: Terminology Web services (2010) 0.01
    0.006652439 = product of:
      0.019957317 = sum of:
        0.019957317 = weight(_text_:to in 4067) [ClassicSimilarity], result of:
          0.019957317 = score(doc=4067,freq=8.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.24104178 = fieldWeight in 4067, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.046875 = fieldNorm(doc=4067)
      0.33333334 = coord(1/3)
    
    Abstract
    Controlled terminologies such as classification schemes, name authorities, and thesauri have long been the domain of the library and information science community. Although historically there have been initiatives towards library style classification of web resources, there remain significant problems with searching and quality judgement of online content. Terminology services can play a key role in opening up access to these valuable resources. By exposing controlled terminologies via a web service, organisations maintain data integrity and version control, whilst motivating external users to design innovative ways to present and utilise their data. We introduce terminology web services and review work in the area. We describe the approaches taken in establishing application programming interfaces (API) and discuss the comparative benefits of a dedicated terminology web service versus general purpose programming languages. We discuss experiences at Glamorgan in creating terminology web services and associated client interface components, in particular for the archaeology domain in the STAR (Semantic Technologies for Archaeological Resources) Project.
    Content
    Part of: Papers from Classification at a Crossroads: Multiple Directions to Usability: International UDC Seminar 2009-Part 2
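Results 11 to 13 report the same score (0.006652439) although the term "to" occurs twice in docs 1342 and 2344 but eight times in doc 4067. A minimal sketch of why, using only the numbers printed in their explain trees: the higher term frequency is cancelled by a smaller fieldNorm, which in ClassicSimilarity generally shrinks as the indexed field gets longer. The helper name is hypothetical.

```python
import math

def field_weight(freq, term_idf, field_norm):
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq).
    return math.sqrt(freq) * term_idf * field_norm

IDF_TO = 1.818051  # idf of "to", copied from the explain output

# Docs 1342 and 2344: freq=2, fieldNorm=0.09375
short_field = field_weight(2.0, IDF_TO, 0.09375)
# Doc 4067: freq=8, fieldNorm=0.046875
long_field = field_weight(8.0, IDF_TO, 0.046875)

# sqrt(8) is twice sqrt(2) and the fieldNorm is halved, so the weights match.
print(f"{short_field:.6f} {long_field:.6f}")
```

Both values should agree with the 0.24104178 shown as fieldWeight in all three entries.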
  14. Binding, C.; Gnoli, C.; Tudhope, D.: Migrating a complex classification scheme to the semantic web : expressing the Integrative Levels Classification using SKOS RDF (2021) 0.01
    0.0061980444 = product of:
      0.018594133 = sum of:
        0.018594133 = weight(_text_:to in 600) [ClassicSimilarity], result of:
          0.018594133 = score(doc=600,freq=10.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.22457743 = fieldWeight in 600, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=600)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The Integrative Levels Classification (ILC) is a comprehensive "freely faceted" knowledge organization system not previously expressed as SKOS (Simple Knowledge Organization System). This paper reports and reflects on work converting the ILC to SKOS representation. Design/methodology/approach - The design of the ILC representation and the various steps in the conversion to SKOS are described and located within the context of previous work considering the representation of complex classification schemes in SKOS. Various issues and trade-offs emerging from the conversion are discussed. The conversion implementation employed the STELETO transformation tool. Findings - The ILC conversion captures some of the ILC facet structure by a limited extension beyond the SKOS standard. SPARQL examples illustrate how this extension could be used to create faceted, compound descriptors when indexing or cataloguing. Basic query patterns are provided that might underpin search systems. Possible routes for reducing complexity are discussed. Originality/value - Complex classification schemes, such as the ILC, have features which are not straightforward to represent in SKOS and which extend beyond the functionality of the SKOS standard. The ILC's facet indicators are modelled as rdf:Property sub-hierarchies that accompany the SKOS RDF statements. The ILC's top-level fundamental facet relationships are modelled by extensions of the associative relationship - specialised sub-properties of skos:related. An approach for representing faceted compound descriptions in ILC and other faceted classification schemes is proposed.
  15. Jones, I.; Cunliffe, D.; Tudhope, D.: Natural language processing and knowledge organization systems as an aid to retrieval (2004) 0.01
    0.0061357515 = product of:
      0.018407254 = sum of:
        0.018407254 = weight(_text_:to in 2677) [ClassicSimilarity], result of:
          0.018407254 = score(doc=2677,freq=20.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.22232032 = fieldWeight in 2677, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2677)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper discusses research that employs methods from Natural Language Processing (NLP) in exploiting the intellectual resources of Knowledge Organization Systems (KOS), particularly in the retrieval of information. A technique for the disambiguation of homographs and nominal compounds in free text, where these are known ambiguous terms in the KOS itself, is described. The use of Roget's Thesaurus as an intermediary in the process is also reported. A short review of the relevant literature in the field is given. Design considerations, results and conclusions are presented from the implementation of a prototype system. The linguistic techniques are applied at two complementary levels, namely on a free text string used as an entry point to the KOS, and on the underlying controlled vocabulary itself.
    Content
    1. Introduction The need for research into the application of linguistic techniques in Information Retrieval (IR) in general, and a similar need in faceted Knowledge Organization Systems (KOS) has been indicated by various authors. Smeaton (1997) points out the inherent limitations of conventional approaches to IR based on "bags of words", mainly difficulties caused by lexical ambiguity in the words concerned, and goes on to suggest the possibility of using Natural Language Processing (NLP) in query formulation. Past experience with a faceted retrieval system highlighted the need for integrating the linguistic perspective in order to fully utilise the potential of a KOS (Tudhope et al. 2002). The present research seeks to address some of these needs in using NLP to improve the efficacy of KOS tools in query and retrieval systems. Syntactic parsing and part-of-speech tagging can substantially reduce lexical ambiguity through homograph disambiguation. Given the two strings "I table the motion" and "I put the motion on the table", for instance, the parser used in this research clearly indicates that 'table' in the first string is a verb, while 'table' in the second string is a noun, a distinction that would be missed in the "bag of words" approach. This syntactic disambiguation enables a more precise matching from free text to the controlled vocabulary of a KOS and vice versa. The use of a general linguistic resource, namely Roget's Thesaurus of English Words and Phrases (RTEWP), as an intermediary in this process, is investigated. The adaptation of the Link parser (Sleator & Temperley, 1993) to the purposes of the research is reported. The design and implementation of the early practical stages of the project are described, and the results of the initial experiments are presented and evaluated. Applications of the techniques developed are foreseen in the areas of query disambiguation, information retrieval and automatic indexing.
    In the first section of the paper a brief review of the literature and relevant current work in the field is presented. The second section includes reports on the development of algorithms, the construction of data sets and theoretical and experimental work undertaken to date. The third section evaluates the results obtained, and outlines directions for future research.
  16. Tudhope, D.; Taylor, C.: A unified similarity coefficient for navigating through multi-dimensional information (1996) 0.01
    0.005761182 = product of:
      0.017283546 = sum of:
        0.017283546 = weight(_text_:to in 7460) [ClassicSimilarity], result of:
          0.017283546 = score(doc=7460,freq=6.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.20874833 = fieldWeight in 7460, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.046875 = fieldNorm(doc=7460)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes an integrated approach to similarity coefficients for information spaces with multiple dimensions of different types of index term. Categorises applications of similarity coefficients underlying different navigation tools in hypermedia by type of term. Describes an implementation of a unified similarity coefficient based on work in numerical taxonomy, with illustrative scenarios from an experimental navigation via similarity tool for a prototype social history museum hypermedia system. The underlying architecture is based on a semantic approach, where semantic relationships can exist between index terms. This allows imprecise matching when comparing for similarity, with distance measures yielding a degree of match. A ranked list of matching items over several weighted dimensions is returned by the similarity navigation tool. The approach has the potential of allowing different access methods to multimedia data to be combined.
  17. Binding, C.; Tudhope, D.: Integrating faceted structure into the search process (2004) 0.01
    0.005761182 = product of:
      0.017283546 = sum of:
        0.017283546 = weight(_text_:to in 2627) [ClassicSimilarity], result of:
          0.017283546 = score(doc=2627,freq=6.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.20874833 = fieldWeight in 2627, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.046875 = fieldNorm(doc=2627)
      0.33333334 = coord(1/3)
    
    Abstract
    The nature of search requirements is perceived to be changing, fuelled by a growing dissatisfaction with the marginal accuracy and often overwhelming quantity of results from simple keyword matching techniques. Traditional search interfaces fail to acknowledge and utilise the implicit underlying structure present within a typical keyword query. Faceted structure can (and should) perform a significant role in this area - acting as the basis for mediation between searcher and indexer, and guiding query formulation and reformulation by interactively educating the user about the native domain. This paper discusses the possible benefits of applying faceted knowledge organization systems to enhance query structure, query visualisation and the overall query process, drawing on the outcomes of a recently completed research project.
  18. Tudhope, D.; Binding, C.; Blocks, D.; Cuncliffe, D.: Representation and retrieval in faceted systems (2003) 0.01
    0.0055436995 = product of:
      0.016631098 = sum of:
        0.016631098 = weight(_text_:to in 2703) [ClassicSimilarity], result of:
          0.016631098 = score(doc=2703,freq=8.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.20086816 = fieldWeight in 2703, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2703)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper discusses two inter-related themes: the retrieval potential of faceted thesauri and XML representations of fundamental facets. Initial findings are discussed from the ongoing 'FACET' project, in collaboration with the National Museum of Science and Industry. The work discussed seeks to take advantage of the structure afforded by faceted systems for multi-term queries and flexible matching, focusing in this paper on the Art and Architecture Thesaurus. A multi-term matching function yields ranked results with partial matches via semantic term expansion, based on a measure of distance over the semantic index space formed by thesaurus relationships. Our intention is to drive the system from general representations and a common query structure and interface. To this end, we are developing an XML representation based on work by the Classification Research Group on fundamental facets or categories. The XML representation maps categories to particular thesauri and hierarchies. The system interface, which is configured by the mapping, incorporates a thesaurus browser with navigation history together with a term search facility and drag and drop query builder.
  19. Vlachidis, A.; Tudhope, D.: ¬A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.01
    0.0055436995 = product of:
      0.016631098 = sum of:
        0.016631098 = weight(_text_:to in 2895) [ClassicSimilarity], result of:
          0.016631098 = score(doc=2895,freq=8.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.20086816 = fieldWeight in 2895, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2895)
      0.33333334 = coord(1/3)
    
    Abstract
    The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.
  20. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.01
    0.005431694 = product of:
      0.016295081 = sum of:
        0.016295081 = weight(_text_:to in 3948) [ClassicSimilarity], result of:
          0.016295081 = score(doc=3948,freq=12.0), product of:
            0.08279609 = queryWeight, product of:
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.045541126 = queryNorm
            0.19680978 = fieldWeight in 3948, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.818051 = idf(docFreq=19512, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.