Search (2 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × theme_ss:"Wissensrepräsentation"
  1. Wright, L.W.; Nardini, H.K.G.; Aronson, A.R.; Rindflesch, T.C.: Hierarchical concept indexing of full-text documents in the Unified Medical Language System Information sources Map (1999) 0.01
    0.0073997467 = product of:
      0.029598987 = sum of:
        0.029598987 = product of:
          0.059197973 = sum of:
            0.059197973 = weight(_text_:project in 2111) [ClassicSimilarity], result of:
              0.059197973 = score(doc=2111,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.27981415 = fieldWeight in 2111, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2111)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is further complicated by the need to search separately many disparate information resources. This research explores how these problems can be addressed by the combined use of 2 techniques: 1) natural language processing for automatic concept-based indexing of full text, and 2) methods for exploiting the structure and hierarchy of full-text documents. We describe methods for applying these techniques to a large collection of full-text documents drawn from the Health Services / Technology Assessment Text (HSTAT) database at the NLM and examine how this hierarchical concept indexing can assist both document- and source-level retrieval in the context of NLM's Information Source Map project
  2. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.00
    0.0049331645 = product of:
      0.019732658 = sum of:
        0.019732658 = product of:
          0.039465316 = sum of:
            0.039465316 = weight(_text_:project in 3948) [ClassicSimilarity], result of:
              0.039465316 = score(doc=3948,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.18654276 = fieldWeight in 3948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3948)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.