Search (14 results, page 1 of 1)

  • author_ss:"Tudhope, D."
  1. Khoo, M.J.; Ahn, J.-w.; Binding, C.; Jones, H.J.; Lin, X.; Massam, D.; Tudhope, D.: Augmenting Dublin Core digital library metadata with Dewey Decimal Classification (2015) 0.04
    0.03995543 = product of:
      0.07991086 = sum of:
        0.067270875 = weight(_text_:description in 2320) [ClassicSimilarity], result of:
          0.067270875 = score(doc=2320,freq=4.0), product of:
            0.23150103 = queryWeight, product of:
              4.64937 = idf(docFreq=1149, maxDocs=44218)
              0.04979191 = queryNorm
            0.29058564 = fieldWeight in 2320, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.64937 = idf(docFreq=1149, maxDocs=44218)
              0.03125 = fieldNorm(doc=2320)
        0.012639986 = product of:
          0.025279973 = sum of:
            0.025279973 = weight(_text_:access in 2320) [ClassicSimilarity], result of:
              0.025279973 = score(doc=2320,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.14979297 = fieldWeight in 2320, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2320)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - The purpose of this paper is to describe a new approach to a well-known problem for digital libraries: how to search across multiple unrelated libraries with a single query. Design/methodology/approach - The approach involves creating new Dewey Decimal Classification terms and numbers from existing Dublin Core records. In total, 263,550 records were harvested from three digital libraries. Weighted key terms were extracted from the title, description and subject fields of each record. Ranked DDC classes were automatically generated from these key terms by considering DDC hierarchies via a series of filtering and aggregation stages. A mean reciprocal ranking evaluation compared a sample of 49 generated classes against DDC classes created by a trained librarian for the same records. Findings - The best results combined weighted key terms from the title, description and subject fields. Performance declines with increased specificity of DDC level. The results compare favorably with similar studies. Research limitations/implications - The metadata harvest required manual intervention and the evaluation was resource intensive. Future research will look at evaluation methodologies that take account of issues of consistency and ecological validity. Practical implications - The method does not require training data and is easily scalable. The pipeline can be customized for individual use cases, for example, recall or precision enhancing. Social implications - The approach can provide centralized access to information from multiple domains currently provided by individual digital libraries. Originality/value - The approach addresses metadata normalization in the context of web resources. The automatic classification approach accounts for matches within hierarchies, aggregating lower level matches to broader parents and thus approximates the practices of a human cataloger.
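The explain tree for result 1 can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of Lucene, and Python's double-precision arithmetic will only agree with the 32-bit values shown above to roughly six significant digits.

```python
import math

# Values copied from the explain output for result 1 (doc 2320).
QUERY_NORM = 0.04979191
MAX_DOCS = 44218


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """score = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# "description": freq=4, docFreq=1149, fieldNorm=0.03125
description = term_score(freq=4.0, doc_freq=1149, field_norm=0.03125)

# "access": freq=2, docFreq=4053, scaled by the inner coord(1/2)
access = term_score(freq=2.0, doc_freq=4053, field_norm=0.03125) * 0.5

# Sum of the matching clauses, scaled by the outer coord(2/4)
score_doc_2320 = (description + access) * 0.5

print(f"description clause: {description:.6f}")
print(f"access clause:      {access:.6f}")
print(f"final score:        {score_doc_2320:.6f}")
```

The same three steps (per-term score, inner coord, outer coord) apply to the other results; only the matched terms, frequencies, field norms and coord factors change.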
  2. Tudhope, D.; Nielsen, M.L.: Introduction to knowledge organization systems and services (2006) 0.04
    0.037629798 = product of:
      0.075259596 = sum of:
        0.05945961 = weight(_text_:description in 5913) [ClassicSimilarity], result of:
          0.05945961 = score(doc=5913,freq=2.0), product of:
            0.23150103 = queryWeight, product of:
              4.64937 = idf(docFreq=1149, maxDocs=44218)
              0.04979191 = queryNorm
            0.25684384 = fieldWeight in 5913, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.64937 = idf(docFreq=1149, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5913)
        0.015799982 = product of:
          0.031599965 = sum of:
            0.031599965 = weight(_text_:access in 5913) [ClassicSimilarity], result of:
              0.031599965 = score(doc=5913,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.18724121 = fieldWeight in 5913, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5913)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In a very real sense, this special issue on Knowledge Organization Systems and Services is concerned with new applications, new contexts and new twists of old themes and problems. We are concerned with diverse attempts to apply the outcomes of much work over the years in artificial subject languages and their intellectual structures to facilitate access to digital information in various settings. This issue has its origins in NKOS workshops, held over the last two years in Bath, Vienna, Madrid and Denver, although the majority of contributions resulted from an open call for papers, disseminated in October 2005. NKOS (http://nkos.slis.kent.edu/) is an informal network whose general aim is to enable knowledge organization systems (KOS) to act as networked information services (both machine-to-machine and human facing), supporting the description and retrieval of information resources on the Internet. Since 1997, there has been an NKOS workshop each year, at either the JCDL or ECDL conference (2005 saw an NKOS workshop at both conferences and also at Dublin Core). Previous NKOS-related special issues have appeared in the online Journal of Digital Information in 2001 and 2004 (Hill and Koch 2001, Tudhope and Koch 2004).
  3. Binding, C.; Tudhope, D.: Terminology Web services (2010) 0.03
    0.030063018 = product of:
      0.060126036 = sum of:
        0.04116606 = weight(_text_:26 in 4067) [ClassicSimilarity], result of:
          0.04116606 = score(doc=4067,freq=2.0), product of:
            0.17584132 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.04979191 = queryNorm
            0.23410915 = fieldWeight in 4067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.046875 = fieldNorm(doc=4067)
        0.018959979 = product of:
          0.037919957 = sum of:
            0.037919957 = weight(_text_:access in 4067) [ClassicSimilarity], result of:
              0.037919957 = score(doc=4067,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.22468945 = fieldWeight in 4067, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4067)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Controlled terminologies such as classification schemes, name authorities, and thesauri have long been the domain of the library and information science community. Although historically there have been initiatives towards library style classification of web resources, there remain significant problems with searching and quality judgement of online content. Terminology services can play a key role in opening up access to these valuable resources. By exposing controlled terminologies via a web service, organisations maintain data integrity and version control, whilst motivating external users to design innovative ways to present and utilise their data. We introduce terminology web services and review work in the area. We describe the approaches taken in establishing application programming interfaces (API) and discuss the comparative benefits of a dedicated terminology web service versus general purpose programming languages. We discuss experiences at Glamorgan in creating terminology web services and associated client interface components, in particular for the archaeology domain in the STAR (Semantic Technologies for Archaeological Resources) Project.
    Date
    6. 1.2011 19:26:14
  4. Golub, K.; Tudhope, D.; Zeng, M.L.; Zumer, M.: Terminology registries for knowledge organization systems : functionality, use, and attributes (2014) 0.02
    0.023525903 = product of:
      0.09410361 = sum of:
        0.09410361 = sum of:
          0.053626917 = weight(_text_:access in 1347) [ClassicSimilarity], result of:
            0.053626917 = score(doc=1347,freq=4.0), product of:
              0.16876608 = queryWeight, product of:
                3.389428 = idf(docFreq=4053, maxDocs=44218)
                0.04979191 = queryNorm
              0.31775886 = fieldWeight in 1347, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.389428 = idf(docFreq=4053, maxDocs=44218)
                0.046875 = fieldNorm(doc=1347)
          0.040476695 = weight(_text_:22 in 1347) [ClassicSimilarity], result of:
            0.040476695 = score(doc=1347,freq=2.0), product of:
              0.17436278 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04979191 = queryNorm
              0.23214069 = fieldWeight in 1347, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1347)
      0.25 = coord(1/4)
    
    Abstract
    Terminology registries (TRs) are a crucial element of the infrastructure required for resource discovery services, digital libraries, Linked Data, and semantic interoperability generally. They can make the content of knowledge organization systems (KOS) available both for human and machine access. The paper describes the attributes and functionality for a TR, based on a review of published literature, existing TRs, and a survey of experts. A domain model based on user tasks is constructed and a set of core metadata elements for use in TRs is proposed. Ideally, the TR should allow searching as well as browsing for a KOS, matching a user's search while also providing information about existing terminology services, accessible to both humans and machines. The issues surrounding metadata for KOS are also discussed, together with the rationale for different aspects and the importance of a core set of KOS metadata for future machine-based access; a possible core set of metadata elements is proposed. This is dealt with in terms of practical experience and in relation to the Dublin Core Application Profile.
    Date
    22. 8.2014 17:12:54
  5. Souza, R.R.; Tudhope, D.; Almeida, M.B.: ¬The KOS spectra : a tentative typology of knowledge organization systems (2010) 0.01
    0.012006768 = product of:
      0.048027072 = sum of:
        0.048027072 = weight(_text_:26 in 3523) [ClassicSimilarity], result of:
          0.048027072 = score(doc=3523,freq=2.0), product of:
            0.17584132 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.04979191 = queryNorm
            0.27312735 = fieldWeight in 3523, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3523)
      0.25 = coord(1/4)
    
    Source
    Paradigms and conceptual systems in knowledge organization: Proceedings of the Eleventh International ISKO Conference, 23-26 February 2010 Rome, Italy. Edited by Claudio Gnoli and Fulvio Mazzocchi
  6. Tudhope, D.; Binding, C.: Still quite popular after all those years : the continued relevance of the information retrieval thesaurus (2016) 0.01
    0.010291515 = product of:
      0.04116606 = sum of:
        0.04116606 = weight(_text_:26 in 2908) [ClassicSimilarity], result of:
          0.04116606 = score(doc=2908,freq=2.0), product of:
            0.17584132 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.04979191 = queryNorm
            0.23410915 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.046875 = fieldNorm(doc=2908)
      0.25 = coord(1/4)
    
    Date
    26. 4.2016 12:44:32
  7. Tudhope, D.: Knowledge Organization System Services : brief review of NKOS activities and possibility of KOS registries (2007) 0.01
    0.010119174 = product of:
      0.040476695 = sum of:
        0.040476695 = product of:
          0.08095339 = sum of:
            0.08095339 = weight(_text_:22 in 100) [ClassicSimilarity], result of:
              0.08095339 = score(doc=100,freq=2.0), product of:
                0.17436278 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04979191 = queryNorm
                0.46428138 = fieldWeight in 100, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=100)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 9.2007 15:41:14
  8. Binding, C.; Tudhope, D.: KOS at your service : Programmatic access to knowledge organisation systems (2004) 0.01
    0.009479989 = product of:
      0.037919957 = sum of:
        0.037919957 = product of:
          0.075839914 = sum of:
            0.075839914 = weight(_text_:access in 1342) [ClassicSimilarity], result of:
              0.075839914 = score(doc=1342,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.4493789 = fieldWeight in 1342, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1342)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
  9. Tudhope, D.; Hodge, G.: Terminology registries (2007) 0.01
    0.008432645 = product of:
      0.03373058 = sum of:
        0.03373058 = product of:
          0.06746116 = sum of:
            0.06746116 = weight(_text_:22 in 539) [ClassicSimilarity], result of:
              0.06746116 = score(doc=539,freq=2.0), product of:
                0.17436278 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04979191 = queryNorm
                0.38690117 = fieldWeight in 539, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=539)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    26.12.2011 13:22:07
  10. Tudhope, D.; Binding, C.: Toward terminology services : experiences with a pilot Web service thesaurus browser (2006) 0.01
    0.007065967 = product of:
      0.028263869 = sum of:
        0.028263869 = product of:
          0.056527738 = sum of:
            0.056527738 = weight(_text_:access in 1955) [ClassicSimilarity], result of:
              0.056527738 = score(doc=1955,freq=10.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.33494726 = fieldWeight in 1955, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1955)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Dublin Core recommends controlled terminology for the subject of a resource. Knowledge organization systems (KOS), such as classifications, gazetteers, taxonomies and thesauri, provide controlled vocabularies that organize and structure concepts for indexing, classifying, browsing and search. For example, a thesaurus employs a set of standard semantic relationships (ISO 2788, ISO 5964), and major thesauri have a large entry vocabulary of terms considered equivalent for retrieval purposes. Many KOS have been made available for Web-based access. However, they are often not fully integrated into indexing and search systems and the full potential for networked and programmatic access remains untapped. The lack of standardized access and interchange formats impedes wider use of KOS resources. We developed a Web demonstrator (www.comp.glam.ac.uk/~FACET/webdemo/) for the FACET project (www.comp.glam.ac.uk/~facet/facetproject.html) that explored thesaurus-based query expansion with the Getty Art and Architecture Thesaurus. A Web demonstrator was implemented via Active Server Pages (ASP) with server-side scripting and compiled server-side components for database access, and cascading style sheets for presentation. The browser-based interactive interface permits dynamic control of query term expansion. However, being based on a custom thesaurus representation and API, the techniques cannot be applied directly to thesauri in other formats on the Web. General programmatic access requires commonly agreed protocols, for example, building on Web and Grid services. The development of common KOS representation formats and service protocols are closely linked. Linda Hill and colleagues argued in 2002 for a general KOS service protocol from which protocols for specific types of KOS can be derived. Thus, in the future, a combination of thesaurus and query protocols might permit a thesaurus to be used with a choice of search tools on various kinds of databases. 
    Service-oriented architectures bring an opportunity for moving toward a clearer separation of interface components from the underlying data sources. In our view, basing distributed protocol services on the atomic elements of thesaurus data structures and relationships is not necessarily the best approach because client operations that require multiple client-server calls would carry too much overhead. This would limit the interfaces that could be offered by applications following such a protocol. Advanced interactive interfaces require protocols that group primitive thesaurus data elements (via their relationships) into composites to achieve reasonable response.
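Result 10 illustrates how length normalization can outweigh term frequency: "access" occurs 10 times in this record but only twice in result 8, yet result 8 has the larger fieldWeight because its fieldNorm is three times higher. A small sketch of that comparison, again assuming the ClassicSimilarity formula fieldWeight = sqrt(freq) * idf * fieldNorm; `field_weight` is an illustrative helper, not a Lucene call.

```python
import math


def field_weight(freq, idf, field_norm):
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)."""
    return math.sqrt(freq) * idf * field_norm


# idf(docFreq=4053, maxDocs=44218), copied from the explain output
IDF_ACCESS = 3.389428

result_10 = field_weight(10.0, IDF_ACCESS, 0.03125)  # doc 1955, long abstract
result_8 = field_weight(2.0, IDF_ACCESS, 0.09375)    # doc 1342, no abstract

print(f"result 10 fieldWeight: {result_10:.6f}")
print(f"result 8 fieldWeight:  {result_8:.6f}")
print("shorter record wins:", result_8 > result_10)
```

Because tf grows only with the square root of the frequency, a fivefold increase in occurrences raises tf by a factor of about 2.24, which is not enough to offset a field norm that is three times smaller.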
  11. Tudhope, D.; Taylor, C.: ¬A unified similarity coefficient for navigating through multi-dimensional information (1996) 0.00
    0.0047399946 = product of:
      0.018959979 = sum of:
        0.018959979 = product of:
          0.037919957 = sum of:
            0.037919957 = weight(_text_:access in 7460) [ClassicSimilarity], result of:
              0.037919957 = score(doc=7460,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.22468945 = fieldWeight in 7460, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7460)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Describes an integrated approach to similarity coefficients for information spaces with multiple dimensions of different types of index term. Categorises applications of similarity coefficients underlying different navigation tools in hypermedia by type of term. Describes an implementation of a unified similarity coefficient based on work in numerical taxonomy, with illustrative scenarios from an experimental navigation via similarity tool for a prototype social history museum hypermedia system. The underlying architecture is based on a semantic approach, where semantic relationships can exist between index terms. This allows imprecise matching when comparing for similarity, with distance measures yielding a degree of match. A ranked list of matching items over several weighted dimensions is returned by the similarity navigation tool. The approach has the potential of allowing different access methods to multimedia data to be combined.
  12. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.00
    0.0039499956 = product of:
      0.015799982 = sum of:
        0.015799982 = product of:
          0.031599965 = sum of:
            0.031599965 = weight(_text_:access in 2918) [ClassicSimilarity], result of:
              0.031599965 = score(doc=2918,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.18724121 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was DDC, also comprising mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
  13. Tudhope, D.; Binding, C.: Mapping between linked data vocabularies in ARIADNE (2015) 0.00
    0.0039499956 = product of:
      0.015799982 = sum of:
        0.015799982 = product of:
          0.031599965 = sum of:
            0.031599965 = weight(_text_:access in 2250) [ClassicSimilarity], result of:
              0.031599965 = score(doc=2250,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.18724121 = fieldWeight in 2250, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2250)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Semantic Enrichment Enabling Sustainability of Archaeological Links (SENESCHAL) was a project coordinated by the Hypermedia Research Unit at the University of South Wales. The project aims included widening access to key vocabulary resources. National cultural heritage thesauri and vocabularies are used by both national organizations and local authority Historic Environment Records and could potentially act as vocabulary hubs for the Web of Data. Following completion, a set of prominent UK archaeological thesauri and vocabularies is now freely available as Linked Open Data (LOD) via http://www.heritagedata.org - together with open source web services and user interface controls. This presentation will reflect on work done to date for the ARIADNE FP7 infrastructure project (http://www.ariadne-infrastructure.eu) mapping between archaeological vocabularies in different languages and the utility of a hub architecture. The poly-hierarchical structure of the Getty Art & Architecture Thesaurus (AAT) was extracted for use as an example mediating structure to interconnect various multilingual vocabularies originating from ARIADNE data providers. Vocabulary resources were first converted to a common concept-based format (SKOS) and the concepts were then manually mapped to nodes of the extracted AAT structure using some judgement on the meaning of terms and scope notes. Results are presented along with reflections on the wider application to existing European archaeological vocabularies and associated online datasets.
  14. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.00
    0.0031599966 = product of:
      0.012639986 = sum of:
        0.012639986 = product of:
          0.025279973 = sum of:
            0.025279973 = weight(_text_:access in 3948) [ClassicSimilarity], result of:
              0.025279973 = score(doc=3948,freq=2.0), product of:
                0.16876608 = queryWeight, product of:
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.04979191 = queryNorm
                0.14979297 = fieldWeight in 3948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.389428 = idf(docFreq=4053, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3948)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique, to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.