Search (10 results, page 1 of 1)

  • author_ss:"Tudhope, D."
  1. Blocks, D.; Cunliffe, D.; Tudhope, D.: A reference model for user-system interaction in thesaurus-based searching (2006) 0.03
    0.026169024 = product of:
      0.07850707 = sum of:
        0.07850707 = weight(_text_:reference in 202) [ClassicSimilarity], result of:
          0.07850707 = score(doc=202,freq=4.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.38140965 = fieldWeight in 202, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.046875 = fieldNorm(doc=202)
      0.33333334 = coord(1/3)
    
    Abstract
    The authors present a model of information searching in thesaurus-enhanced search systems, intended as a reference model for system developers. The model focuses on user-system interaction and charts the specific stages of searching an indexed collection with a thesaurus. It was developed based on literature, findings from empirical studies, and analysis of existing systems. The model describes in detail the entities, processes, and decisions when interacting with a search system augmented with a thesaurus. A basic search scenario illustrates this process through the model. Graphical and textual depictions of the model are complemented by a concise matrix representation for evaluation purposes. Potential problems at different stages of the search process are discussed, together with possibilities for system developers. The aim is to set out a framework of processes, decisions, and risks involved in thesaurus-based search, within which system developers can consider potential avenues for support.
  2. Binding, C.; Tudhope, D.: Improving interoperability using vocabulary linked data (2015) 0.02
    0.02180752 = product of:
      0.06542256 = sum of:
        0.06542256 = weight(_text_:reference in 2205) [ClassicSimilarity], result of:
          0.06542256 = score(doc=2205,freq=4.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.31784135 = fieldWeight in 2205, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2205)
      0.33333334 = coord(1/3)
    
    Abstract
    The concept of Linked Data has been an emerging theme within the computing and digital heritage areas in recent years. The growth and scale of Linked Data has underlined the need for greater commonality in concept referencing, to avoid local redefinition and duplication of reference resources. Achieving domain-wide agreement on common vocabularies would be an unreasonable expectation; however, datasets often already have local vocabulary resources defined, and so the prospects for large-scale interoperability can be substantially improved by creating alignment links from these local vocabularies out to common external reference resources. The ARIADNE project is undertaking large-scale integration of archaeology dataset metadata records, to create a cross-searchable research repository resource. Key to enabling this cross search will be the 'subject' metadata originating from multiple data providers, containing terms from multiple multilingual controlled vocabularies. This paper discusses various aspects of vocabulary mapping. Experience from the previous SENESCHAL project in the publication of controlled vocabularies as Linked Open Data is discussed, emphasizing the importance of unique URI identifiers for vocabulary concepts. There is a need to align legacy indexing data to the uniquely defined concepts and examples are discussed of SENESCHAL data alignment work. A case study for the ARIADNE project presents work on mapping between vocabularies, based on the Getty Art and Architecture Thesaurus as a central hub and employing an interactive vocabulary mapping tool developed for the project, which generates SKOS mapping relationships in JSON and other formats. The potential use of such vocabulary mappings to assist cross search over archaeological datasets from different countries is illustrated in a pilot experiment. The results demonstrate the enhanced opportunities for interoperability and cross searching that the approach offers.
  3. Vlachidis, A.; Tudhope, D.: A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.02
    0.015420245 = product of:
      0.046260733 = sum of:
        0.046260733 = weight(_text_:reference in 2895) [ClassicSimilarity], result of:
          0.046260733 = score(doc=2895,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.22474778 = fieldWeight in 2895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2895)
      0.33333334 = coord(1/3)
    
    Abstract
    The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain-oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.
  4. Tudhope, D.: Knowledge Organization System Services : brief review of NKOS activities and possibility of KOS registries (2007) 0.01
    0.013709504 = product of:
      0.041128512 = sum of:
        0.041128512 = product of:
          0.082257025 = sum of:
            0.082257025 = weight(_text_:22 in 100) [ClassicSimilarity], result of:
              0.082257025 = score(doc=100,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.46428138 = fieldWeight in 100, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=100)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 9.2007 15:41:14
  5. Tudhope, D.; Hodge, G.: Terminology registries (2007) 0.01
    0.011424588 = product of:
      0.034273762 = sum of:
        0.034273762 = product of:
          0.068547525 = sum of:
            0.068547525 = weight(_text_:22 in 539) [ClassicSimilarity], result of:
              0.068547525 = score(doc=539,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.38690117 = fieldWeight in 539, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=539)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    26.12.2011 13:22:07
  6. Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: FACET: thesaurus retrieval with semantic term expansion (2002) 0.01
    0.010547735 = product of:
      0.031643204 = sum of:
        0.031643204 = product of:
          0.06328641 = sum of:
            0.06328641 = weight(_text_:database in 175) [ClassicSimilarity], result of:
              0.06328641 = score(doc=175,freq=6.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.3094352 = fieldWeight in 175, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.03125 = fieldNorm(doc=175)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This demonstration of a research prototype illustrates a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms. The work is part of the EPSRC funded FACET project in collaboration with the UK National Museum of Science and Industry (NMSI) which includes the National Railway Museum. An export of NMSI's Collections Database is used as the dataset for the research. The J. Paul Getty Trust's Art and Architecture Thesaurus (AAT) is the main thesaurus in the project. The AAT is a widely used thesaurus (over 120,000 terms). Descriptors are organised in 7 facets representing separate conceptual classes of terms. The FACET application is a multi-tiered architecture accessing a SQL Server database, with an OLE DB connection. The thesauri are stored as relational tables in the Server's database. However, a key component of the system is a parallel representation of the underlying semantic network as an in-memory structure of thesaurus concepts (corresponding to preferred terms). The structure models the hierarchical and associative interrelationships of thesaurus concepts via weighted poly-hierarchical links. Its primary purpose is real-time semantic expansion of query terms, achieved by a spreading activation semantic closeness algorithm. Queries with associated results are stored persistently using XML format data. A Visual Basic interface combines a thesaurus browser and an initial term search facility that takes into account equivalence relationships. Terms are dragged to a direct manipulation Query Builder which maintains the facet structure.
  7. Tudhope, D.; Blocks, D.; Cunliffe, D.; Binding, C.: Query expansion via conceptual distance in thesaurus indexed collections (2006) 0.01
    0.0076121716 = product of:
      0.022836514 = sum of:
        0.022836514 = product of:
          0.045673028 = sum of:
            0.045673028 = weight(_text_:database in 2215) [ClassicSimilarity], result of:
              0.045673028 = score(doc=2215,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.2233156 = fieldWeight in 2215, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2215)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this paper is to explore query expansion via conceptual distance in thesaurus indexed collections. Design/methodology/approach - An extract of the National Museum of Science and Industry's collections database, indexed with the Getty Art and Architecture Thesaurus (AAT), was the dataset for the research. The system architecture and algorithms for semantic closeness and the matching function are outlined. Standalone and web interfaces are described and formative qualitative user studies are discussed. One user session is discussed in detail, together with a scenario based on a related public inquiry. Findings are set in context of the literature on thesaurus-based query expansion. This paper discusses the potential of query expansion techniques using the semantic relationships in a faceted thesaurus. Findings - Thesaurus-assisted retrieval systems have potential for multi-concept descriptors, permitting very precise queries and indexing. However, indexer and searcher may differ in terminology judgments and there may not be any exactly matching results. The integration of semantic closeness in the matching function permits ranked results for multi-concept queries in thesaurus-indexed applications. An in-memory representation of the thesaurus semantic network allows a combination of automatic and interactive control of expansion and control of expansion on individual query terms. Originality/value - The application of semantic expansion to browsing may be useful in interface options where thesaurus structure is hidden.
  8. Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: Compound descriptors in context : a matching function for classifications and thesauri (2002) 0.01
    0.0076121716 = product of:
      0.022836514 = sum of:
        0.022836514 = product of:
          0.045673028 = sum of:
            0.045673028 = weight(_text_:database in 3179) [ClassicSimilarity], result of:
              0.045673028 = score(doc=3179,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.2233156 = fieldWeight in 3179, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3179)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This paper discusses a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms, which has the potential to help with recall problems. The work reported is part of the ongoing FACET project in collaboration with the National Museum of Science and Industry and its collections database. The architecture of the prototype system and its interface are outlined. The matching problem for compound descriptors is reviewed and the FACET implementation described. Results are discussed from scenarios using the faceted Getty Art and Architecture Thesaurus. We argue that automatic traversal of thesaurus relationships can augment the user's browsing possibilities. The techniques can be applied both to unstructured multi-concept subject headings and potentially to more syntactically structured strings. The notion of a focus term is used by the matching function to model AAT modified descriptors (noun phrases). The relevance of the approach to precoordinated indexing and matching faceted strings is discussed.
  9. Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.01
    0.006854752 = product of:
      0.020564256 = sum of:
        0.020564256 = product of:
          0.041128512 = sum of:
            0.041128512 = weight(_text_:22 in 1347) [ClassicSimilarity], result of:
              0.041128512 = score(doc=1347,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.23214069 = fieldWeight in 1347, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1347)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 8.2014 17:12:54
  10. Tudhope, D.; Binding, C.: Toward terminology services : experiences with a pilot Web service thesaurus browser (2006) 0.01
    0.006089737 = product of:
      0.018269211 = sum of:
        0.018269211 = product of:
          0.036538422 = sum of:
            0.036538422 = weight(_text_:database in 1955) [ClassicSimilarity], result of:
              0.036538422 = score(doc=1955,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.17865248 = fieldWeight in 1955, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1955)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Dublin Core recommends controlled terminology for the subject of a resource. Knowledge organization systems (KOS), such as classifications, gazetteers, taxonomies and thesauri, provide controlled vocabularies that organize and structure concepts for indexing, classifying, browsing and search. For example, a thesaurus employs a set of standard semantic relationships (ISO 2788, ISO 5964), and major thesauri have a large entry vocabulary of terms considered equivalent for retrieval purposes. Many KOS have been made available for Web-based access. However, they are often not fully integrated into indexing and search systems and the full potential for networked and programmatic access remains untapped. The lack of standardized access and interchange formats impedes wider use of KOS resources. We developed a Web demonstrator (www.comp.glam.ac.uk/~FACET/webdemo/) for the FACET project (www.comp.glam.ac.uk/~facet/facetproject.html) that explored thesaurus-based query expansion with the Getty Art and Architecture Thesaurus. A Web demonstrator was implemented via Active Server Pages (ASP) with server-side scripting and compiled server-side components for database access, and cascading style sheets for presentation. The browser-based interactive interface permits dynamic control of query term expansion. However, being based on a custom thesaurus representation and API, the techniques cannot be applied directly to thesauri in other formats on the Web. General programmatic access requires commonly agreed protocols, for example, building on Web and Grid services. The development of common KOS representation formats and service protocols are closely linked. Linda Hill and colleagues argued in 2002 for a general KOS service protocol from which protocols for specific types of KOS can be derived. Thus, in the future, a combination of thesaurus and query protocols might permit a thesaurus to be used with a choice of search tools on various kinds of databases. 
Service-oriented architectures bring an opportunity for moving toward a clearer separation of interface components from the underlying data sources. In our view, basing distributed protocol services on the atomic elements of thesaurus data structures and relationships is not necessarily the best approach because client operations that require multiple client-server calls would carry too much overhead. This would limit the interfaces that could be offered by applications following such a protocol. Advanced interactive interfaces require protocols that group primitive thesaurus data elements (via their relationships) into composites to achieve reasonable response.
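
Note on the score explanations: each result above carries a ClassicSimilarity explanation tree, and the arithmetic in those trees can be checked by hand. The sketch below is not taken from the search system itself. It assumes the classic TF-IDF form that the listed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with every coord factor multiplied in at the end. The function names and the coords parameter are hypothetical, chosen only for this illustration; queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed classic formula; it reproduces idf(docFreq=2055, maxDocs=44218) ~ 4.06837.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one single-term explanation tree (hypothetical helper)."""
    tf = math.sqrt(freq)                   # e.g. 2.0 = tf(freq=4.0)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm        # "queryWeight, product of"
    field_weight = tf * idf * field_norm   # "fieldWeight in <doc>, product of"
    score = query_weight * field_weight
    for matched, total in coords:          # e.g. coord(1/2), coord(1/3)
        score *= matched / total
    return score

# Values copied from the listing above.
QUERY_NORM = 0.050593734
MAX_DOCS = 44218

# Result 1 (doc 202): term "reference", freq=4.0, single coord(1/3).
score_202 = classic_term_score(4.0, 2055, MAX_DOCS, QUERY_NORM, 0.046875,
                               coords=[(1, 3)])

# Result 4 (doc 100): term "22", freq=2.0, nested coord(1/2) and coord(1/3).
score_100 = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.09375,
                               coords=[(1, 2), (1, 3)])

print(f"doc 202: {score_202:.6f}")
print(f"doc 100: {score_100:.6f}")
```

Under these assumptions the two recomputed values agree with the listed 0.026169024 and 0.013709504 to within float rounding. Result 1 outranks result 2 purely through fieldNorm (0.046875 against 0.0390625), since both match the same term with the same frequency.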