Search (19 results, page 1 of 1)

Jones, I.; Cunliffe, D.; Tudhope, D.: Natural language processing and knowledge organization systems as an aid to retrieval (2004) 0.03
```
0.028729646 = product of:
  0.05745929 = sum of:
    0.04766385 = weight(_text_:processing in 2677) [ClassicSimilarity], result of:
      0.04766385 = score(doc=2677,freq=6.0), product of:
        0.175792 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.043425296 = queryNorm
        0.27113777 = fieldWeight in 2677, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2677)
    0.009795443 = product of:
      0.029386329 = sum of:
        0.029386329 = weight(_text_:29 in 2677) [ClassicSimilarity], result of:
          0.029386329 = score(doc=2677,freq=4.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.19237353 = fieldWeight in 2677, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2677)
      0.33333334 = coord(1/3)
  0.5 = coord(2/4)
```
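The explain tree above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and are not part of the search system itself. The numeric inputs are copied from the listing for doc 2677.

```python
import math

def tf(freq):
    # Classic TF-IDF term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Assumed classic idf formula; it reproduces the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm              # "queryWeight" line
    field_weight = tf(freq) * term_idf * field_norm   # "fieldWeight" line
    return query_weight * field_weight                # "score(doc=..., freq=...)" line

# Values taken from the explain output for doc 2677.
QUERY_NORM = 0.043425296
MAX_DOCS = 44218
FIELD_NORM = 0.02734375

processing = term_score(6.0, 2097, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_29 = term_score(4.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The inner clause matched 1 of 3 sub-terms, the outer query 2 of 4 clauses.
total = (processing + term_29 * (1 / 3)) * (2 / 4)
print(f"processing={processing:.8f} term_29={term_29:.8f} total={total:.9f}")
```

Under these assumptions the computed values agree with the listing up to single-precision rounding, which is why the record is displayed with a rounded score of 0.03.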
Abstract

This paper discusses research that employs methods from Natural Language Processing (NLP) in exploiting the intellectual resources of Knowledge Organization Systems (KOS), particularly in the retrieval of information. A technique for the disambiguation of homographs and nominal compounds in free text, where these are known ambiguous terms in the KOS itself, is described. The use of Roget's Thesaurus as an intermediary in the process is also reported. A short review of the relevant literature in the field is given. Design considerations, results and conclusions are presented from the implementation of a prototype system. The linguistic techniques are applied at two complementary levels, namely on a free text string used as an entry point to the KOS, and on the underlying controlled vocabulary itself.

Content

1. Introduction The need for research into the application of linguistic techniques in Information Retrieval (IR) in general, and a similar need in faceted Knowledge Organization Systems (KOS), has been indicated by various authors. Smeaton (1997) points out the inherent limitations of conventional approaches to IR based on "bags of words", mainly difficulties caused by lexical ambiguity in the words concerned, and goes on to suggest the possibility of using Natural Language Processing (NLP) in query formulation. Past experience with a faceted retrieval system highlighted the need for integrating the linguistic perspective in order to fully utilise the potential of a KOS (Tudhope et al., 2002). The present research seeks to address some of these needs in using NLP to improve the efficacy of KOS tools in query and retrieval systems. Syntactic parsing and part-of-speech tagging can substantially reduce lexical ambiguity through homograph disambiguation. Given the two strings "I table the motion" and "I put the motion on the table", for instance, the parser used in this research clearly indicates that 'table' in the first string is a verb, while 'table' in the second string is a noun, a distinction that would be missed in the "bag of words" approach. This syntactic disambiguation enables a more precise matching from free text to the controlled vocabulary of a KOS and vice versa. The use of a general linguistic resource, namely Roget's Thesaurus of English Words and Phrases (RTEWP), as an intermediary in this process, is investigated. The adaptation of the Link parser (Sleator & Temperley, 1993) to the purposes of the research is reported. The design and implementation of the early practical stages of the project are described, and the results of the initial experiments are presented and evaluated. Applications of the techniques developed are foreseen in the areas of query disambiguation, information retrieval and automatic indexing.
In the first section of the paper a brief review of the literature and relevant current work in the field is presented. The second section includes reports on the development of algorithms, the construction of data sets and theoretical and experimental work undertaken to date. The third section evaluates the results obtained, and outlines directions for future research.
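The contrast between a "bag of words" and a syntax-aware reading can be illustrated with the two example strings, read here as "I table the motion" and "I put the motion on the table". The sketch below is a toy, not the Link parser used in the paper: the word lists and the `toy_tag` heuristic (looking only at the preceding token) are hypothetical and serve only to show why word order carries the disambiguating information.

```python
from collections import Counter

# Hypothetical, deliberately tiny word lists for the illustration.
DETERMINERS = {"the", "a", "an"}
PRONOUNS = {"i", "you", "we", "they"}

def bag_of_words(text):
    # Order-free representation: only token counts survive.
    return Counter(text.lower().split())

def toy_tag(text, target):
    # Guess the part of speech of each occurrence of `target`
    # from the token immediately before it.
    tokens = text.lower().split()
    tags = []
    for i, token in enumerate(tokens):
        if token != target:
            continue
        previous = tokens[i - 1] if i > 0 else ""
        if previous in DETERMINERS:
            tags.append("noun")
        elif previous in PRONOUNS:
            tags.append("verb")
        else:
            tags.append("unknown")
    return tags

first = "I table the motion"
second = "I put the motion on the table"

# Both bags contain 'table' exactly once, so they cannot tell the senses apart.
print(bag_of_words(first)["table"], bag_of_words(second)["table"])
# The position-aware heuristic separates the verb from the noun.
print(toy_tag(first, "table"), toy_tag(second, "table"))
```

A real parser assigns these tags from a full syntactic analysis rather than from one neighbouring token; the toy only makes visible what the order-free representation discards.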

Date

29. 8.2004 19:29:56
Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.03
```
0.026196454 = product of:
  0.052392907 = sum of:
    0.044476993 = weight(_text_:processing in 3948) [ClassicSimilarity], result of:
      0.044476993 = score(doc=3948,freq=4.0), product of:
        0.175792 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.043425296 = queryNorm
        0.2530092 = fieldWeight in 3948, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.03125 = fieldNorm(doc=3948)
    0.007915913 = product of:
      0.023747738 = sum of:
        0.023747738 = weight(_text_:29 in 3948) [ClassicSimilarity], result of:
          0.023747738 = score(doc=3948,freq=2.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.15546128 = fieldWeight in 3948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
      0.33333334 = coord(1/3)
  0.5 = coord(2/4)
```
Abstract

Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.

Date

29. 8.2010 12:03:40
Vlachidis, A.; Tudhope, D.: A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.02
```
0.022430437 = product of:
  0.044860873 = sum of:
    0.03931248 = weight(_text_:processing in 2895) [ClassicSimilarity], result of:
      0.03931248 = score(doc=2895,freq=2.0), product of:
        0.175792 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.043425296 = queryNorm
        0.22363065 = fieldWeight in 2895, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2895)
    0.005548396 = product of:
      0.016645188 = sum of:
        0.016645188 = weight(_text_:science in 2895) [ClassicSimilarity], result of:
          0.016645188 = score(doc=2895,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.1455159 = fieldWeight in 2895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2895)
      0.33333334 = coord(1/3)
  0.5 = coord(2/4)
```
Abstract

The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.

Source

Journal of the Association for Information Science and Technology. 67(2016) no.5, S.1138-1152

Tudhope, D.; Taylor, C.: Navigation via similarity (1997) 0.01

0.011793743 = product of:
  0.04717497 = sum of:
    0.04717497 = weight(_text_:processing in 155) [ClassicSimilarity], result of:
      0.04717497 = score(doc=155,freq=2.0), product of:
        0.175792 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.043425296 = queryNorm
        0.26835677 = fieldWeight in 155, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.046875 = fieldNorm(doc=155)
  0.25 = coord(1/4)

Source: Information processing and management. 33(1997) no.2, S.233-242

Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.01

0.009212566 = product of:
  0.036850262 = sum of:
    0.036850262 = product of:
      0.05527539 = sum of:
        0.019974224 = weight(_text_:science in 1347) [ClassicSimilarity], result of:
          0.019974224 = score(doc=1347,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.17461908 = fieldWeight in 1347, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=1347)
        0.035301168 = weight(_text_:22 in 1347) [ClassicSimilarity], result of:
          0.035301168 = score(doc=1347,freq=2.0), product of:
            0.15206799 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.043425296 = queryNorm
            0.23214069 = fieldWeight in 1347, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1347)
      0.6666667 = coord(2/3)
  0.25 = coord(1/4)

Date: 22. 8.2014 17:12:54
Source: Journal of the Association for Information Science and Technology. 65(2014) no.9, S.1901-1916

Tudhope, D.; Blocks, D.; Cunliffe, D.; Binding, C.: Query expansion via conceptual distance in thesaurus indexed collections (2006) 0.01
```
0.007721644 = product of:
  0.030886576 = sum of:
    0.030886576 = product of:
      0.046329863 = sum of:
        0.016645188 = weight(_text_:science in 2215) [ClassicSimilarity], result of:
          0.016645188 = score(doc=2215,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.1455159 = fieldWeight in 2215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2215)
        0.029684676 = weight(_text_:29 in 2215) [ClassicSimilarity], result of:
          0.029684676 = score(doc=2215,freq=2.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.19432661 = fieldWeight in 2215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2215)
      0.6666667 = coord(2/3)
  0.25 = coord(1/4)
```
Abstract

Purpose - The purpose of this paper is to explore query expansion via conceptual distance in thesaurus indexed collections. Design/methodology/approach - An extract of the National Museum of Science and Industry's collections database, indexed with the Getty Art and Architecture Thesaurus (AAT), was the dataset for the research. The system architecture and algorithms for semantic closeness and the matching function are outlined. Standalone and web interfaces are described and formative qualitative user studies are discussed. One user session is discussed in detail, together with a scenario based on a related public inquiry. Findings are set in context of the literature on thesaurus-based query expansion. This paper discusses the potential of query expansion techniques using the semantic relationships in a faceted thesaurus. Findings - Thesaurus-assisted retrieval systems have potential for multi-concept descriptors, permitting very precise queries and indexing. However, indexer and searcher may differ in terminology judgments and there may not be any exactly matching results. The integration of semantic closeness in the matching function permits ranked results for multi-concept queries in thesaurus-indexed applications. An in-memory representation of the thesaurus semantic network allows a combination of automatic and interactive control of expansion and control of expansion on individual query terms. Originality/value - The application of semantic expansion to browsing may be useful in interface options where thesaurus structure is hidden.

Date

30. 7.2011 16:07:29
Matthews, B.; Jones, C.; Puzon, B.; Moon, J.; Tudhope, D.; Golub, K.; Nielsen, M.L.: An evaluation of enhancing social tagging with a knowledge organization system (2010) 0.01
```
0.007721644 = product of:
  0.030886576 = sum of:
    0.030886576 = product of:
      0.046329863 = sum of:
        0.016645188 = weight(_text_:science in 4171) [ClassicSimilarity], result of:
          0.016645188 = score(doc=4171,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.1455159 = fieldWeight in 4171, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4171)
        0.029684676 = weight(_text_:29 in 4171) [ClassicSimilarity], result of:
          0.029684676 = score(doc=4171,freq=2.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.19432661 = fieldWeight in 4171, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4171)
      0.6666667 = coord(2/3)
  0.25 = coord(1/4)
```
Abstract

Purpose - Traditional subject indexing and classification are considered infeasible in many digital collections. This paper seeks to investigate ways of enhancing social tagging via knowledge organization systems, with a view to improving the quality of tags for increased information discovery and retrieval performance. Design/methodology/approach - Enhanced tagging interfaces were developed for exemplar online repositories, and trials were undertaken with author and reader groups to evaluate the effectiveness of tagging augmented with control vocabulary for subject indexing of papers in online repositories. Findings - The results showed that using a knowledge organisation system to augment tagging does appear to increase the effectiveness of non-specialist users (that is, without information science training) in subject indexing. Research limitations/implications - While limited by the size and scope of the trials undertaken, these results do point to the usefulness of a mixed approach in supporting the subject indexing of online resources. Originality/value - The value of this work is as a guide to future developments in the practical support for resource indexing in online repositories.

Date

29. 8.2010 11:39:20

Tudhope, D.: Knowledge Organization System Services : brief review of NKOS activities and possibility of KOS registries (2007) 0.01

0.005883528 = product of:
  0.023534112 = sum of:
    0.023534112 = product of:
      0.070602335 = sum of:
        0.070602335 = weight(_text_:22 in 100) [ClassicSimilarity], result of:
          0.070602335 = score(doc=100,freq=2.0), product of:
            0.15206799 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.043425296 = queryNorm
            0.46428138 = fieldWeight in 100, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=100)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Date: 22. 9.2007 15:41:14

Tudhope, D.; Hodge, G.: Terminology registries (2007) 0.00

0.0049029402 = product of:
  0.019611761 = sum of:
    0.019611761 = product of:
      0.058835283 = sum of:
        0.058835283 = weight(_text_:22 in 539) [ClassicSimilarity], result of:
          0.058835283 = score(doc=539,freq=2.0), product of:
            0.15206799 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.043425296 = queryNorm
            0.38690117 = fieldWeight in 539, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=539)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Date: 26.12.2011 13:22:07

Binding, C.; Tudhope, D.: Integrating faceted structure into the search process (2004) 0.00

0.0029684675 = product of:
  0.01187387 = sum of:
    0.01187387 = product of:
      0.03562161 = sum of:
        0.03562161 = weight(_text_:29 in 2627) [ClassicSimilarity], result of:
          0.03562161 = score(doc=2627,freq=2.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.23319192 = fieldWeight in 2627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=2627)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Date: 29. 8.2004 9:08:02

Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.00

0.0024737231 = product of:
  0.009894893 = sum of:
    0.009894893 = product of:
      0.029684676 = sum of:
        0.029684676 = weight(_text_:29 in 2300) [ClassicSimilarity], result of:
          0.029684676 = score(doc=2300,freq=2.0), product of:
            0.15275662 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.043425296 = queryNorm
            0.19432661 = fieldWeight in 2300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Source: Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro

Golub, K.; Soergel, D.; Buchanan, G.; Tudhope, D.; Lykke, M.; Hiom, D.: A framework for evaluating automatic indexing or classification in the context of retrieval (2016) 0.00

0.001961654 = product of:
  0.007846616 = sum of:
    0.007846616 = product of:
      0.023539849 = sum of:
        0.023539849 = weight(_text_:science in 3311) [ClassicSimilarity], result of:
          0.023539849 = score(doc=3311,freq=4.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.20579056 = fieldWeight in 3311, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3311)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Series: Advances in information science
Source: Journal of the Association for Information Science and Technology. 67(2016) no.1, S.3-16

Tudhope, D.; Taylor, C.: A unified similarity coefficient for navigating through multi-dimensional information (1996) 0.00

0.0016645187 = product of:
  0.006658075 = sum of:
    0.006658075 = product of:
      0.019974224 = sum of:
        0.019974224 = weight(_text_:science in 7460) [ClassicSimilarity], result of:
          0.019974224 = score(doc=7460,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.17461908 = fieldWeight in 7460, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=7460)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Source: Global complexity: information, chaos and control. Proceedings of the 59th Annual Meeting of the American Society for Information Science, ASIS'96, Baltimore, Maryland, 21-24 Oct 1996. Ed.: S. Hardin

Blocks, D.; Cunliffe, D.; Tudhope, D.: A reference model for user-system interaction in thesaurus-based searching (2006) 0.00

0.0016645187 = product of:
  0.006658075 = sum of:
    0.006658075 = product of:
      0.019974224 = sum of:
        0.019974224 = weight(_text_:science in 202) [ClassicSimilarity], result of:
          0.019974224 = score(doc=202,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.17461908 = fieldWeight in 202, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=202)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Source: Journal of the American Society for Information Science and Technology. 57(2006) no.12, S.1655-1665

Binding, C.; Tudhope, D.: Terminology Web services (2010) 0.00
```
0.0016645187 = product of:
  0.006658075 = sum of:
    0.006658075 = product of:
      0.019974224 = sum of:
        0.019974224 = weight(_text_:science in 4067) [ClassicSimilarity], result of:
          0.019974224 = score(doc=4067,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.17461908 = fieldWeight in 4067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=4067)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

Controlled terminologies such as classification schemes, name authorities, and thesauri have long been the domain of the library and information science community. Although historically there have been initiatives towards library style classification of web resources, there remain significant problems with searching and quality judgement of online content. Terminology services can play a key role in opening up access to these valuable resources. By exposing controlled terminologies via a web service, organisations maintain data integrity and version control, whilst motivating external users to design innovative ways to present and utilise their data. We introduce terminology web services and review work in the area. We describe the approaches taken in establishing application programming interfaces (API) and discuss the comparative benefits of a dedicated terminology web service versus general purpose programming languages. We discuss experiences at Glamorgan in creating terminology web services and associated client interface components, in particular for the archaeology domain in the STAR (Semantic Technologies for Archaeological Resources) Project.
Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: Representation and retrieval in faceted systems (2003) 0.00
```
0.001387099 = product of:
  0.005548396 = sum of:
    0.005548396 = product of:
      0.016645188 = sum of:
        0.016645188 = weight(_text_:science in 2703) [ClassicSimilarity], result of:
          0.016645188 = score(doc=2703,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.1455159 = fieldWeight in 2703, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2703)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

This paper discusses two inter-related themes: the retrieval potential of faceted thesauri and XML representations of fundamental facets. Initial findings are discussed from the ongoing 'FACET' project, in collaboration with the National Museum of Science and Industry. The work discussed seeks to take advantage of the structure afforded by faceted systems for multi-term queries and flexible matching, focusing in this paper on the Art and Architecture Thesaurus. A multi-term matching function yields ranked results with partial matches via semantic term expansion, based on a measure of distance over the semantic index space formed by thesaurus relationships. Our intention is to drive the system from general representations and a common query structure and interface. To this end, we are developing an XML representation based on work by the Classification Research Group on fundamental facets or categories. The XML representation maps categories to particular thesauri and hierarchies. The system interface, which is configured by the mapping, incorporates a thesaurus browser with navigation history together with a term search facility and drag and drop query builder.
Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: Compound descriptors in context : a matching function for classifications and thesauri (2002) 0.00
```
0.001387099 = product of:
  0.005548396 = sum of:
    0.005548396 = product of:
      0.016645188 = sum of:
        0.016645188 = weight(_text_:science in 3179) [ClassicSimilarity], result of:
          0.016645188 = score(doc=3179,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.1455159 = fieldWeight in 3179, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3179)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This paper discusses a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms, which has the potential to help with recall problems. The work reported is part of the ongoing FACET project in collaboration with the National Museum of Science and Industry and its collections database. The architecture of the prototype system and its interface are outlined. The matching problem for compound descriptors is reviewed and the FACET implementation described. Results are discussed from scenarios using the faceted Getty Art and Architecture Thesaurus. We argue that automatic traversal of thesaurus relationships can augment the user's browsing possibilities. The techniques can be applied both to unstructured multi-concept subject headings and potentially to more syntactically structured strings. The notion of a focus term is used by the matching function to model AAT modified descriptors (noun phrases). The relevance of the approach to precoordinated indexing and matching faceted strings is discussed.
Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: FACET: thesaurus retrieval with semantic term expansion (2002) 0.00
```
0.0011096792 = product of:
  0.004438717 = sum of:
    0.004438717 = product of:
      0.01331615 = sum of:
        0.01331615 = weight(_text_:science in 175) [ClassicSimilarity], result of:
          0.01331615 = score(doc=175,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.11641272 = fieldWeight in 175, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.03125 = fieldNorm(doc=175)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This demonstration of a research prototype illustrates a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms. The work is part of the EPSRC funded FACET project in collaboration with the UK National Museum of Science and Industry (NMSI) which includes the National Railway Museum. An export of NMSI's Collections Database is used as the dataset for the research. The J. Paul Getty Trust's Art and Architecture Thesaurus (AAT) is the main thesaurus in the project. The AAT is a widely used thesaurus (over 120,000 terms). Descriptors are organised in 7 facets representing separate conceptual classes of terms. The FACET application is a multi-tiered architecture accessing a SQL Server database, with an OLE DB connection. The thesauri are stored as relational tables in the Server's database. However, a key component of the system is a parallel representation of the underlying semantic network as an in-memory structure of thesaurus concepts (corresponding to preferred terms). The structure models the hierarchical and associative interrelationships of thesaurus concepts via weighted poly-hierarchical links. Its primary purpose is real-time semantic expansion of query terms, achieved by a spreading activation semantic closeness algorithm. Queries with associated results are stored persistently using XML format data. A Visual Basic interface combines a thesaurus browser and an initial term search facility that takes into account equivalence relationships. Terms are dragged to a direct manipulation Query Builder which maintains the facet structure.
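The idea of expanding a query term over an in-memory network of weighted thesaurus links can be sketched as follows. This is a generic spreading-activation illustration under stated assumptions, not the FACET algorithm: the concepts, link weights, multiplicative decay and cut-off threshold are all hypothetical and are not taken from the AAT or from the abstract.

```python
import heapq

# Hypothetical toy network: concept -> {neighbouring concept: link weight in (0, 1]}.
THESAURUS = {
    "chairs": {"seating furniture": 0.8, "stools": 0.5},
    "seating furniture": {"chairs": 0.8, "benches": 0.8, "furniture": 0.5},
    "stools": {"chairs": 0.5},
    "benches": {"seating furniture": 0.8},
    "furniture": {"seating furniture": 0.5},
}

def expand(term, threshold=0.3):
    # Activation starts at 1.0 on the query term and is multiplied by the
    # link weight at every step; concepts below the threshold are dropped.
    best = {term: 1.0}
    heap = [(-1.0, term)]  # max-heap via negated activation
    while heap:
        negated, concept = heapq.heappop(heap)
        activation = -negated
        if activation < best.get(concept, 0.0):
            continue  # stale entry: a stronger path was already found
        for neighbour, weight in THESAURUS.get(concept, {}).items():
            spread = activation * weight
            if spread >= threshold and spread > best.get(neighbour, 0.0):
                best[neighbour] = spread
                heapq.heappush(heap, (-spread, neighbour))
    return best

# Rank the expanded terms by their closeness to the starting concept.
for concept, closeness in sorted(expand("chairs").items(), key=lambda item: -item[1]):
    print(f"{concept}: {closeness:.2f}")
```

Raising the threshold narrows the expansion to the nearest concepts, which is one simple way a system could expose control over how far a query term is expanded.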

Tudhope, D.; Binding, C.: Toward terminology services : experiences with a pilot Web service thesaurus browser (2006) 0.00

0.0011096792 = product of:
  0.004438717 = sum of:
    0.004438717 = product of:
      0.01331615 = sum of:
        0.01331615 = weight(_text_:science in 1955) [ClassicSimilarity], result of:
          0.01331615 = score(doc=1955,freq=2.0), product of:
            0.11438741 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.043425296 = queryNorm
            0.11641272 = fieldWeight in 1955, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.03125 = fieldNorm(doc=1955)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)

Source: Bulletin of the American Society for Information Science and Technology. 33(2006) no.5, S.xx-xx
