Search (38 results, page 1 of 2)

  • × language_ss:"e"
  • × theme_ss:"Konzeption und Anwendung des Prinzips Thesaurus"
  • × year_i:[2000 TO 2010}
  1. Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002) 0.11
    0.11420593 = product of:
      0.15227456 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 5226) [ClassicSimilarity], result of:
              0.023542227 = score(doc=5226,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 5226, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5226)
          0.25 = coord(1/4)
        0.056460675 = weight(_text_:term in 5226) [ClassicSimilarity], result of:
          0.056460675 = score(doc=5226,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 5226, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5226)
        0.08992833 = weight(_text_:frequency in 5226) [ClassicSimilarity], result of:
          0.08992833 = score(doc=5226,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.32531026 = fieldWeight in 5226, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5226)
      0.75 = coord(3/4)
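
    The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function and variable names are illustrative and not part of any Lucene API. All input numbers are copied from the tree for doc 5226.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, mirroring the nesting of the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 5226.
QUERY_NORM = 0.04694356
FIELD_NORM = 0.0390625
MAX_DOCS = 44218

based = term_score(2.0, 5906, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term = term_score(2.0, 1130, MAX_DOCS, QUERY_NORM, FIELD_NORM)
frequency = term_score(2.0, 332, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# "based" sits in a nested clause with coord(1/4); the outer query has coord(3/4).
total = (based * 0.25 + term + frequency) * 0.75
print(round(total, 4))
```

    Small differences in the last digits are expected, because Lucene computes these values in single precision while Python uses double precision.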
    
    Abstract
    Tseng constructs a word co-occurrence-based thesaurus by means of the automatic analysis of Chinese text. Words are identified by a longest dictionary match supplemented by a keyword extraction algorithm that merges back nearby tokens and accepts shorter strings of characters if they occur more often than the longest string. Single-character auxiliary words are a major source of error, but this can be greatly reduced with the use of a 70-character 2680 word stop list. Extracted terms with their associated document weights are sorted by decreasing frequency, and the top of this list is associated using a Dice coefficient, modified to account for longer documents, on the weights of term pairs. Co-occurrence is measured not in the document as a whole but in paragraph- or sentence-sized sections in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles, and judges were asked to review the top 50 terms associated with each of 30 single-word query terms. They determined 69% to be relevant.
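
    The co-occurrence step described in this abstract can be illustrated with a small sketch. It uses the plain, unweighted Dice coefficient over sentence-sized windows, not the modified, weight-based variant the abstract attributes to Tseng; the function name and the sample windows are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations

def dice_associations(windows):
    """Return {(term_a, term_b): dice} for terms sharing at least one window.

    windows: iterable of term lists, one list per sentence-sized section.
    """
    # Map each term to the set of window indexes in which it occurs.
    term_windows = defaultdict(set)
    for idx, terms in enumerate(windows):
        for t in set(terms):
            term_windows[t].add(idx)

    scores = {}
    for a, b in combinations(sorted(term_windows), 2):
        shared = len(term_windows[a] & term_windows[b])
        if shared:
            # Plain Dice: 2 * |A and B| / (|A| + |B|)
            scores[(a, b)] = 2.0 * shared / (
                len(term_windows[a]) + len(term_windows[b])
            )
    return scores

# Hypothetical sample data: three short windows of already-segmented terms.
windows = [
    ["thesaurus", "term", "index"],
    ["thesaurus", "term"],
    ["index", "search"],
]
scores = dice_associations(windows)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

    Restricting the count to small windows rather than whole documents keeps the number of candidate pairs low, which is the computational motivation given in the abstract.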
  2. Losee, R.M.: Decisions in thesaurus construction and use (2007) 0.05
    0.051439807 = product of:
      0.10287961 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 924) [ClassicSimilarity], result of:
              0.028250674 = score(doc=924,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 924, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=924)
          0.25 = coord(1/4)
        0.09581695 = weight(_text_:term in 924) [ClassicSimilarity], result of:
          0.09581695 = score(doc=924,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.4374403 = fieldWeight in 924, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=924)
      0.5 = coord(2/4)
    
    Abstract
    A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of more than one set of possible subclasses should be used. Based on retrospective measurements or estimates of future performance when using thesaurus terms in document ordering, decisions are made so as to maximize performance. These decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus is described, consistent with the decision criteria developed here. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in "breadcrumb navigation".
  3. McCulloch, E.: Thesauri: practical guidance for construction (2005) 0.04
    0.037407737 = product of:
      0.074815474 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 4724) [ClassicSimilarity], result of:
              0.028250674 = score(doc=4724,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 4724, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4724)
          0.25 = coord(1/4)
        0.06775281 = weight(_text_:term in 4724) [ClassicSimilarity], result of:
          0.06775281 = score(doc=4724,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 4724, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=4724)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - With the growing recognition that thesauri aid information retrieval, organisations are beginning to adopt, and in many cases, create thesauri. This paper offers some guidance on the construction process. Design/methodology/approach - An opinion piece with a practical focus, based on recent experiences gleaned from consultancy work. Findings - A number of steps can be taken to ensure any thesaurus under construction is fit for purpose. Due consideration is therefore given to aspects such as term selection, structure and notation, thesauri standards, software and Web display issues, thesauri evaluation and maintenance. This paper also notes that creating new subject schemes from scratch, however attractive, contributes to the plethora of terminologies currently in existence and can limit user searching within particular contexts. The decision to create a "new" thesaurus should therefore be taken carefully and observance of standards is paramount. Practical implications - This paper offers advice to assist practitioners in the development of thesauri. Originality/value - Useful guidance for those practitioners new to the area of thesaurus construction is provided, together with an overview of selected key processes involved in the construction of a thesaurus.
  4. Li, K.W.; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis (2005) 0.04
    0.036016613 = product of:
      0.07203323 = sum of:
        0.008155267 = product of:
          0.032621067 = sum of:
            0.032621067 = weight(_text_:based in 3391) [ClassicSimilarity], result of:
              0.032621067 = score(doc=3391,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.2306343 = fieldWeight in 3391, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3391)
          0.25 = coord(1/4)
        0.06387796 = weight(_text_:term in 3391) [ClassicSimilarity], result of:
          0.06387796 = score(doc=3391,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.29162687 = fieldWeight in 3391, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.03125 = fieldNorm(doc=3391)
      0.5 = coord(2/4)
    
    Abstract
    For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge in generating an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages.
    To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
  5. Spiteri, L.F.: Word association testing and thesaurus construction : a pilot study (2005) 0.03
    0.034227468 = product of:
      0.13690987 = sum of:
        0.13690987 = weight(_text_:term in 5216) [ClassicSimilarity], result of:
          0.13690987 = score(doc=5216,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.62504494 = fieldWeight in 5216, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5216)
      0.25 = coord(1/4)
    
    Abstract
    This pilot study examines the use of word association testing in the derivation of user-derived descriptors, descriptor hierarchies, and categories of inter-term relationships for the purpose of thesaurus construction. Ten participants, who were students, were presented with a test-bed of 15 domain-specific stimulus terms and were asked to provide as many response words as they could for each stimulus term and to describe how the response and stimulus terms are inter-related. The word association test was successful in generating a significant number of word pairs and facet indicators that could be used to display inter-term relationships in thesauri.
  6. Wang, J.: Automatic thesaurus development : term extraction from title metadata (2006) 0.03
    0.031173116 = product of:
      0.06234623 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 5063) [ClassicSimilarity], result of:
              0.023542227 = score(doc=5063,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 5063, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5063)
          0.25 = coord(1/4)
        0.056460675 = weight(_text_:term in 5063) [ClassicSimilarity], result of:
          0.056460675 = score(doc=5063,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 5063, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5063)
      0.5 = coord(2/4)
    
    Abstract
    The application of thesauri in networked environments is seriously hampered by the challenges of introducing new concepts and terminology into the formal controlled vocabulary, which is critical for enhancing its retrieval capability. The author describes an automated process of adding new terms to thesauri as entry vocabulary by analyzing the association between words/phrases extracted from bibliographic titles and subject descriptors in the metadata record (subject descriptors are terms assigned from controlled vocabularies of thesauri to describe the subjects of the objects [e.g., books, articles] represented by the metadata records). The investigated approach uses a corpus of metadata for scientific and technical (S&T) publications in which the titles contain substantive words for key topics. The three steps of the method are (a) extracting words and phrases from the title field of the metadata; (b) applying a method to identify and select the specific and meaningful keywords based on the associated controlled vocabulary terms from the thesaurus used to catalog the objects; and (c) inserting selected keywords into the thesaurus as new terms (most of them are in hierarchical relationships with the existing concepts), thereby updating the thesaurus with new terminology that is being used in the literature. The effectiveness of the method was demonstrated by an experiment with the Chinese Classification Thesaurus (CCT) and bibliographic data in China Machine-Readable Cataloging Record (MARC) format (CNMARC) provided by Peking University Library. This approach is equally effective in large-scale collections and in other languages.
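
    Steps (a) to (c) of this abstract can be sketched roughly as follows. The abstract does not state which association measure is used, so the share of a word's titles that carry a given descriptor stands in for it here; the whitespace tokenizer, the stop list, the thresholds, and the sample records are all assumptions for illustration.

```python
from collections import Counter, defaultdict

# Assumed stop list; a real system would need language-specific segmentation.
STOPWORDS = {"of", "the", "a", "an", "in", "for", "and", "on"}

def candidate_entry_terms(records, min_count=2, min_share=0.6):
    """Map title words to the descriptor they are most strongly tied to.

    records: list of (title, [subject descriptors]) pairs.
    """
    word_count = Counter()
    pair_count = defaultdict(Counter)

    # Step (a): extract words from the title field.
    for title, descriptors in records:
        words = {w for w in title.lower().split() if w not in STOPWORDS}
        for w in words:
            word_count[w] += 1
            for d in descriptors:
                pair_count[w][d] += 1

    # Step (b): keep words that are frequent and tied to one descriptor.
    candidates = {}
    for w, n in word_count.items():
        if n < min_count or not pair_count[w]:
            continue
        descriptor, hits = pair_count[w].most_common(1)[0]
        if hits / n >= min_share and w != descriptor.lower():
            # Step (c): propose the word as an entry term under the descriptor.
            candidates[w] = descriptor
    return candidates

# Hypothetical metadata records.
records = [
    ("Wavelet compression of images", ["Image processing"]),
    ("Wavelet denoising for images", ["Image processing"]),
    ("Survey of graph algorithms", ["Graph theory"]),
]
print(candidate_entry_terms(records))
```

    The output is only a list of proposals; deciding where a new term sits in the hierarchy would still need the thesaurus structure, which this sketch does not model.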
  7. Shiri, A.A.; Revie, C.; Chowdhury, G.: Thesaurus-assisted search term selection and query expansion : a review of user-centred studies (2002) 0.03
    0.029337829 = product of:
      0.117351316 = sum of:
        0.117351316 = weight(_text_:term in 1330) [ClassicSimilarity], result of:
          0.117351316 = score(doc=1330,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.5357528 = fieldWeight in 1330, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=1330)
      0.25 = coord(1/4)
    
    Abstract
    This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies that adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections: first, studies on thesaurus-aided search term selection; and second, studies dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.
  8. Tudhope, D.; Alani, H.; Jones, C.: Augmenting thesaurus relationships : possibilities for retrieval (2001) 0.02
    0.024448192 = product of:
      0.09779277 = sum of:
        0.09779277 = weight(_text_:term in 1520) [ClassicSimilarity], result of:
          0.09779277 = score(doc=1520,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.44646066 = fieldWeight in 1520, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1520)
      0.25 = coord(1/4)
    
    Abstract
    This paper discusses issues concerning the augmentation of thesaurus relationships, in light of new application possibilities for retrieval. We first discuss a case study that explored the retrieval potential of an augmented set of thesaurus relationships by specialising standard relationships into richer subtypes, in particular hierarchical geographical containment and the associative relationship. We then locate this work in a broader context by reviewing various attempts to build taxonomies of thesaurus relationships, and conclude by discussing the feasibility of hierarchically augmenting the core set of thesaurus relationships, particularly the associative relationship. We discuss the possibility of enriching the specification and semantics of Related Term (RT) relationships, while maintaining compatibility with traditional thesauri via a limited hierarchical extension of the associative (and hierarchical) relationships. This would be facilitated by distinguishing the type of term from the (sub)type of relationship and explicitly specifying semantic categories for terms following a faceted approach. We first illustrate how hierarchical spatial relationships can be used to provide more flexible retrieval for queries incorporating place names in applications employing online gazetteers and geographical thesauri. We then employ a set of experimental scenarios to investigate key issues affecting use of the associative (RT) thesaurus relationships in semantic distance measures. Previous work has noted the potential of RTs in thesaurus search aids but also the problem of uncontrolled expansion of query term sets. Results presented in this paper suggest the potential for taking account of the hierarchical context of an RT link and specialisations of the RT relationship.
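
    One way to picture a semantic distance measure over thesaurus relationships is a shortest-path search in which each relationship type carries its own traversal cost. The sketch below is a generic Dijkstra search, not the measure used by the authors; the cost values, the distance cap, and the sample links are assumptions for illustration.

```python
import heapq

# Assumed traversal costs per relationship type; RT is made more expensive
# so that associative links expand a query less aggressively.
COST = {"BT": 1.0, "NT": 1.0, "RT": 2.0}

def semantic_distance(links, source, target, max_distance=float("inf")):
    """Cheapest path cost between two terms, or None if out of reach.

    links: list of (term_a, relationship, term_b) triples, treated as
    traversable in both directions at the same cost.
    """
    graph = {}
    for a, rel, b in links:
        graph.setdefault(a, []).append((b, COST[rel]))
        graph.setdefault(b, []).append((a, COST[rel]))

    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            new_dist = dist + cost
            # The cap bounds expansion of the candidate term set.
            if new_dist <= max_distance and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return None

# Hypothetical geographical thesaurus fragment.
links = [
    ("Europe", "NT", "France"),
    ("France", "NT", "Paris"),
    ("Paris", "RT", "Eiffel Tower"),
]
print(semantic_distance(links, "Europe", "Paris"))
print(semantic_distance(links, "Europe", "Eiffel Tower", max_distance=3.0))
```

    Specialised RT subtypes, as discussed in the abstract, could be modelled by adding further keys to the cost table.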
  9. Shiri, A.: Topic familiarity and its effects on term selection and browsing in a thesaurus-enhanced search environment (2005) 0.02
    0.019961866 = product of:
      0.07984746 = sum of:
        0.07984746 = weight(_text_:term in 613) [ClassicSimilarity], result of:
          0.07984746 = score(doc=613,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.3645336 = fieldWeight in 613, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=613)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - To evaluate the extent to which familiarity with search topics affects the ways in which users select and browse search terms in a thesaurus-enhanced search setting. Design/methodology/approach - An experimental methodology was adopted to study users' search behaviour in an operational information retrieval environment. Findings - Topic familiarity and subject knowledge influence some search and interaction behaviours. Searches involving moderately and very familiar topics were associated with browsing around twice as many thesaurus terms as was the case for unfamiliar topics. Research limitations/implications - Some search behaviours such as thesaurus browsing and term selection could be used as an indication of user levels of topic familiarity. Practical implications - The results of this study provide design implications as to how to develop personalized search interfaces where users with varying levels of familiarity with search topics can carry out searches. Originality/value - This paper establishes the importance of topic familiarity characteristics and the effects of those characteristics on users' interaction with search interfaces enhanced with semantic tools such as thesauri.
  10. ISO 25964 Thesauri and interoperability with other vocabularies (2008) 0.02
    0.018703869 = product of:
      0.037407737 = sum of:
        0.0035313342 = product of:
          0.014125337 = sum of:
            0.014125337 = weight(_text_:based in 1169) [ClassicSimilarity], result of:
              0.014125337 = score(doc=1169,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.09986758 = fieldWeight in 1169, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1169)
          0.25 = coord(1/4)
        0.033876404 = weight(_text_:term in 1169) [ClassicSimilarity], result of:
          0.033876404 = score(doc=1169,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.1546585 = fieldWeight in 1169, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1169)
      0.5 = coord(2/4)
    
    Abstract
    T.1: Today's thesauri are mostly electronic tools, having moved on from the paper-based era when thesaurus standards were first developed. They are built and maintained with the support of software and need to integrate with other software, such as search engines and content management systems. Whereas in the past thesauri were designed for information professionals trained in indexing and searching, today there is a demand for vocabularies that untrained users will find to be intuitive. ISO 25964 makes the transition needed for the world of electronic information management. However, part 1 retains the assumption that human intellect is usually involved in the selection of indexing terms and in the selection of search terms. If both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved. This is the main principle underlying thesaurus design, even though a thesaurus built for human users may also be applied in situations where computers make the choices. Efficient exchange of data is a vital component of thesaurus management and exploitation. Hence the inclusion in this standard of recommendations for exchange formats and protocols. Adoption of these will facilitate interoperability between thesaurus management systems and the other computer applications, such as indexing and retrieval systems, that will utilize the data. Thesauri are typically used in post-coordinate retrieval systems, but may also be applied to hierarchical directories, pre-coordinate indexes and classification systems. Increasingly, thesaurus applications need to mesh with others, such as automatic categorization schemes, free-text search systems, etc. Part 2 of ISO 25964 describes additional types of structured vocabulary and gives recommendations to enable interoperation of the vocabularies at all stages of the information storage and retrieval process.
  11. Fischer, D.H.: Converting a thesaurus to OWL : Notes on the paper "The National Cancer Institute's Thesaurus and Ontology" (2004) 0.02
    0.01795325 = product of:
      0.0359065 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 2362) [ClassicSimilarity], result of:
              0.03295912 = score(doc=2362,freq=8.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 2362, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2362)
          0.25 = coord(1/4)
        0.027666723 = product of:
          0.055333447 = sum of:
            0.055333447 = weight(_text_:assessment in 2362) [ClassicSimilarity], result of:
              0.055333447 = score(doc=2362,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.2134973 = fieldWeight in 2362, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2362)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The paper analysed here is a kind of position paper. In order to get a better understanding of the reported work I used the retrieval interface of the thesaurus, the so-called NCI DTS Browser accessible via the Web, and I perused the cited OWL file with numerous "Find" and "Find next" string searches. In addition the file was imported into Protégé 2000, Release 2.0, with OWL Plugin 1.0 and Racer Plugin 1.7.14. At the end of the paper's introduction the authors say: "In the following sections, this paper will describe the terminology development process at NCI, and the issues associated with converting a description logic based nomenclature to a semantically rich OWL ontology." While I will not deal with the first part, i.e. the terminology development process at NCI, I do not see the thesaurus as a description logic based nomenclature, or its current state and conversion already result in a "rich" OWL ontology. What does "rich" mean here? According to my view there is a great quantity of concepts and links but a very poor description logic structure which enables inferences. And what does the following really mean, which is said a few lines previously: "Although editors have defined a number of named ontologic relations to support the description-logic based structure of the Thesaurus, additional relationships are considered for inclusion as required to support dependent applications."
    According to my findings several relations available in the thesaurus query interface as "roles", are not used, i.e. there are not yet any assertions with them. And those which are used do not contribute to complete concept definitions of concepts which represent thesaurus main entries. In other words: The authors claim to already have a "description logic based nomenclature", where there is not yet one which deserves that title by being much more than a thesaurus with strict subsumption and additional inheritable semantic links. In the last section of the paper the authors say: "The most time consuming process in this conversion was making a careful analysis of the Thesaurus to understand the best way to translate it into OWL." "For other conversions, these same types of distinctions and decisions must be made. The expressive power of a proprietary encoding can vary widely from that in OWL or RDF. Understanding the original semantics and engineering a solution that most closely duplicates it is critical for creating a useful and accurate ontology." My question is: What decisions were made and are they exemplary, can they be recommended as "the best way"? I raise strong doubts with respect to that, and I miss more profound discussions of the issues at stake. The following notes are dedicated to a critical description and assessment of the results of that conversion activity. They are written in a tutorial style more or less addressing students, but myself being a learner especially in the field of medical knowledge representation I do not speak "ex cathedra".
  12. Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.02
    0.016956761 = product of:
      0.033913523 = sum of:
        0.011652809 = product of:
          0.046611235 = sum of:
            0.046611235 = weight(_text_:based in 156) [ClassicSimilarity], result of:
              0.046611235 = score(doc=156,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.3295462 = fieldWeight in 156, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.25 = coord(1/4)
        0.022260714 = product of:
          0.04452143 = sum of:
            0.04452143 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
              0.04452143 = score(doc=156,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.2708308 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The present study investigates the ability of a bibliometric based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
  13. Milstead, J.L.: Standards for relationships between subject indexing terms (2001) 0.02
    0.016938202 = product of:
      0.06775281 = sum of:
        0.06775281 = weight(_text_:term in 1148) [ClassicSimilarity], result of:
          0.06775281 = score(doc=1148,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 1148, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=1148)
      0.25 = coord(1/4)
    
    Abstract
    Relationships between the terms in thesauri and indexes are the subject of national and international standards. The standards for thesauri enumerate and provide criteria for three basic types of relationship: equivalence, hierarchical, and associative. Standards and guidelines for indexes draw on the thesaurus standards to provide less detailed guidance for showing relationships between the terms used in an index. The international standard for multilingual thesauri adds recommendations for assuring equal treatment of the languages of a thesaurus. The present standards were developed when lookup and search were essentially manual, and the value of the kinds of relationships has never been determined. It is not clear whether users understand or can use the distinctions between kinds of relationships. On the other hand, sophisticated text analysis systems may be able both to assist with development of more powerful term relationship schemes and to use the relationships to improve retrieval.
  14. Shiri, A.A.; Revie, C.: End-user interaction with thesauri : an evaluation of cognitive overlap in search term selection (2004) 0.02
    0.016938202 = product of:
      0.06775281 = sum of:
        0.06775281 = weight(_text_:term in 2658) [ClassicSimilarity], result of:
          0.06775281 = score(doc=2658,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 2658, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=2658)
      0.25 = coord(1/4)
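
The score breakdown above can be checked by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these are inferred here because they reproduce the displayed values, not taken from this page. The function name `classic_score` is hypothetical and not part of any library.

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute one single-term score from the values shown in an explain tree.

    Assumes the classic TF-IDF weighting; hypothetical helper for illustration.
    """
    tf = math.sqrt(freq)                              # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # about 4.66603 for docFreq=1130
    query_weight = idf * query_norm                   # "queryWeight" line
    field_weight = tf * idf * field_norm              # "fieldWeight" line
    return query_weight * field_weight * coord        # scaled by coord(1/4)


# Values copied from the breakdown for the term "term" in doc 2658.
score = classic_score(
    freq=2.0,
    doc_freq=1130,
    max_docs=44218,
    query_norm=0.04694356,
    field_norm=0.046875,
    coord=1 / 4,
)
print(f"{score:.6f}")
```

The result should agree with the listed 0.016938202 up to rounding, since the index stores these factors as 32-bit floats while Python uses 64-bit floats.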
    
  15. Owens, L.A.; Cochrane, P.A.: Thesaurus evaluation (2004) 0.01
    0.013833362 = product of:
      0.055333447 = sum of:
        0.055333447 = product of:
          0.11066689 = sum of:
            0.11066689 = weight(_text_:assessment in 4856) [ClassicSimilarity], result of:
              0.11066689 = score(doc=4856,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.4269946 = fieldWeight in 4856, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4856)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The process of thesaurus evaluation can enhance the value of a thesaurus in terms of usability, scope, precision and recall. Structural, formative, observational and comparative evaluation techniques are explained along with specific examples of their use. These methods of evaluation can be applied in the assessment of an existing thesaurus or the construction of a new thesaurus. The history of thesauri since 1960, the development of national and international standards, and sources of evaluative literature are also discussed.
  16. Aitchison, J.; Dextre Clarke, S.G.: ¬The Thesaurus : a historical viewpoint, with a look to the future (2004) 0.01
    0.013071639 = product of:
      0.026143279 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 5005) [ClassicSimilarity], result of:
              0.028250674 = score(doc=5005,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 5005, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5005)
          0.25 = coord(1/4)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 5005) [ClassicSimilarity], result of:
              0.038161222 = score(doc=5005,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 5005, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5005)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    After a period of experiment and evolution in the 1950s and 1960s, a fairly standard format for thesauri was established with the publication of the influential Thesaurus of Engineering and Scientific Terms (TEST) in 1967. This and other early thesauri relied primarily on the presentation of terms in alphabetical order. The value of a classified presentation was subsequently realised, and in particular the technique of facet analysis has profoundly influenced thesaurus evolution. Thesaurofacet and the Art & Architecture Thesaurus have acted as models for two distinct breeds of thesaurus using faceted displays of terms. As of the 1990s, the expansion of end-user access to vast networked resources is imposing further requirements on the style and structure of controlled vocabularies. The international standards for thesauri, first conceived in a print-based era, are badly in need of updating. Work is in hand in the UK and the USA to revise and develop standards in support of electronic thesauri.
    Date
    22. 9.2007 15:46:13
  17. Bagheri, M.: Development of thesauri in Iran (2006) 0.01
    0.013071639 = product of:
      0.026143279 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 260) [ClassicSimilarity], result of:
              0.028250674 = score(doc=260,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 260, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=260)
          0.25 = coord(1/4)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 260) [ClassicSimilarity], result of:
              0.038161222 = score(doc=260,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 260, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=260)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The need for Persian thesauri became apparent during the late 1960s with the advent of documentation centres in Iran. The first Persian controlled vocabulary was published by IRANDOC in 1977. Other centres worked on translations of existing thesauri, but it was soon realised that these efforts did not meet the needs of the centres. After the Islamic revolution in 1979, the foundation of new centres intensified the need for Persian thesauri, especially in the fields of history and government documents. Also, during the Iran-Iraq war, Iranian research centres produced reports in scientific and technical fields, both to support military requirements and to meet society's needs. In order to provide a comprehensive thesaurus, the Council of Scientific Research of Iran approved a project for the compilation of such a work. Nowadays, 12 Persian thesauri are available and others are being prepared, based on the literary corpus and conformity with characteristics of Iranian culture.
    Source
    Indexer. 25(2006) no.1, S.19-22
  18. Moreira, A.; Alvarenga, L.; Paiva Oliveira, A. de: "Thesaurus" and "Ontology" : a study of the definitions found in the computer and information science literature (2004) 0.01
    0.011977118 = product of:
      0.047908474 = sum of:
        0.047908474 = weight(_text_:term in 3726) [ClassicSimilarity], result of:
          0.047908474 = score(doc=3726,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.21872015 = fieldWeight in 3726, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3726)
      0.25 = coord(1/4)
    
    Abstract
    This is a comparative analysis of the term ontology, used in the computer science domain, with the term thesaurus, used in the information science domain. The aim of the study is to establish the main convergence points of these two knowledge representation instruments and to point out their differences. In order to fulfill this goal an analytical-synthetic method was applied to extract the meaning underlying each of the selected definitions of the instruments. The definitions were obtained from texts well accepted by the research community from both areas. The definitions were applied to a KWIC system in order to rotate the terms that were examined qualitatively and quantitatively. We concluded that thesauri and ontologies operate at the same knowledge level, the epistemological level, in spite of different origins and purposes.
  19. Hill, L.: New Protocols for Gazetteer and Thesaurus Services (2002) 0.01
    0.011292135 = product of:
      0.04516854 = sum of:
        0.04516854 = weight(_text_:term in 1206) [ClassicSimilarity], result of:
          0.04516854 = score(doc=1206,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.20621133 = fieldWeight in 1206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.03125 = fieldNorm(doc=1206)
      0.25 = coord(1/4)
    
    Abstract
    The Alexandria Digital Library Project announces the online publication of two protocols to support querying and response interactions using distributed services: one for gazetteers and one for thesauri. These protocols have been developed for our own purposes and also to support the general interoperability of gazetteers and thesauri on the web. See <http://www.alexandria.ucsb.edu/~gjanee/gazetteer/> and <http://www.alexandria.ucsb.edu/~gjanee/thesaurus/>. For the gazetteer protocol, we have provided a page of test forms that can be used to experiment with the operational functions of the protocol in accessing two gazetteers: the ADL Gazetteer and the ESRI Gazetteer (ESRI has participated in the development of the gazetteer protocol). We are in the process of developing a thesaurus server and a simple client to demonstrate the use of the thesaurus protocol. We are soliciting comments on both protocols. Please remember that we are seeking protocols that are essentially "simple" and easy to implement and that support basic operations - they should not duplicate all of the functions of specialized gazetteer and thesaurus interfaces. We continue to discuss ways of handling various issues and to further develop the protocols. For the thesaurus protocol, outstanding issues include the treatment of multilingual thesauri and the degree to which the language attribute should be supported; whether the Scope Note element should be changed to a repeatable Note element; the best way to handle the hierarchical report for multi-hierarchies where portions of the hierarchy are repeated; and whether support for searching by term identifiers is redundant and unnecessary given that the terms themselves are unique within a thesaurus. For the gazetteer protocol, we continue to work on validation of query and report XML documents and on implementing the part of the protocol designed to support the submission of new entries to a gazetteer. 
    We would like to encourage open discussion of these protocols through the NKOS discussion list (see the NKOS webpage at <http://nkos.slis.kent.edu/>) and the CGGR-L discussion list that focuses on gazetteer development (see ADL Gazetteer Development page at <http://www.alexandria.ucsb.edu/gazetteer>).
  20. Qin, J.; Paling, S.: Converting a controlled vocabulary into an ontology : the case of GEM (2001) 0.01
    0.0095403055 = product of:
      0.038161222 = sum of:
        0.038161222 = product of:
          0.076322444 = sum of:
            0.076322444 = weight(_text_:22 in 3895) [ClassicSimilarity], result of:
              0.076322444 = score(doc=3895,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.46428138 = fieldWeight in 3895, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3895)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    24. 8.2005 19:20:22

Types

  • a 29
  • el 7
  • m 2
  • n 2
  • s 1
  • x 1