-
Qin, J.; Paling, S.: Converting a controlled vocabulary into an ontology : the case of GEM (2001)
0.04
0.03535254 = product of:
0.07070508 = sum of:
0.07070508 = sum of:
0.009399767 = weight(_text_:a in 3895) [ClassicSimilarity], result of:
0.009399767 = score(doc=3895,freq=4.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.2161963 = fieldWeight in 3895, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.09375 = fieldNorm(doc=3895)
0.06130531 = weight(_text_:22 in 3895) [ClassicSimilarity], result of:
0.06130531 = score(doc=3895,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.46428138 = fieldWeight in 3895, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.09375 = fieldNorm(doc=3895)
0.5 = coord(1/2)
- Date
- 24. 8.2005 19:20:22
- Type
- a
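The score breakdown for this record can be reproduced by hand. The sketch below is a minimal illustration, assuming the classic Lucene TF-IDF formulas that the numbers in the listing are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency, assuming the classic TF-IDF form.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Score of one query term in one document: queryWeight * fieldWeight.
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 3895.
QUERY_NORM = 0.037706986
MAX_DOCS = 44218
FIELD_NORM = 0.09375

score_a = term_score(4.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "a"
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "22"

# The listing multiplies the sum of both term weights by coord(1/2) = 0.5.
total = 0.5 * (score_a + score_22)
print(round(total, 6))
```

The result agrees with the listed 0.03535254 up to floating-point rounding (the listing appears to use single precision). The rarer term "22" contributes far more than the very common term "a" because its idf is about three times larger and idf enters the product twice.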
-
Tudhope, D.; Hodge, G.: Terminology registries (2007)
0.03
0.028313313 = product of:
0.056626625 = sum of:
0.056626625 = sum of:
0.0055388655 = weight(_text_:a in 539) [ClassicSimilarity], result of:
0.0055388655 = score(doc=539,freq=2.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.12739488 = fieldWeight in 539, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.078125 = fieldNorm(doc=539)
0.05108776 = weight(_text_:22 in 539) [ClassicSimilarity], result of:
0.05108776 = score(doc=539,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.38690117 = fieldWeight in 539, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.078125 = fieldNorm(doc=539)
0.5 = coord(1/2)
- Abstract
- A discussion on current initiatives regarding terminology registries.
- Date
- 26.12.2011 13:22:07
-
Dextre Clarke, S.G.: Thesaural relationships (2001)
0.02
0.021757921 = product of:
0.043515842 = sum of:
0.043515842 = sum of:
0.007754412 = weight(_text_:a in 1149) [ClassicSimilarity], result of:
0.007754412 = score(doc=1149,freq=8.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.17835285 = fieldWeight in 1149, product of:
2.828427 = tf(freq=8.0), with freq of:
8.0 = termFreq=8.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=1149)
0.03576143 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
0.03576143 = score(doc=1149,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.2708308 = fieldWeight in 1149, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0546875 = fieldNorm(doc=1149)
0.5 = coord(1/2)
- Abstract
- A thesaurus in the controlled vocabulary environment is a tool designed to support effective information retrieval (IR) by guiding indexers and searchers consistently to choose the same terms for expressing a given concept or combination of concepts. Terms in the thesaurus are linked by relationships of three well-known types: equivalence, hierarchical, and associative. The functions and properties of these three basic types and some subcategories are described, as well as some additional relationship types commonly found in thesauri. Progressive automation of IR processes and the capability for simultaneous searching of vast networked resources are creating some pressures for change in the categorization and consistency of relationships.
- Date
- 22. 9.2007 15:45:57
- Type
- a
-
Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005)
0.02
0.021238474 = product of:
0.04247695 = sum of:
0.04247695 = sum of:
0.006715518 = weight(_text_:a in 156) [ClassicSimilarity], result of:
0.006715518 = score(doc=156,freq=6.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.1544581 = fieldWeight in 156, product of:
2.4494898 = tf(freq=6.0), with freq of:
6.0 = termFreq=6.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=156)
0.03576143 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
0.03576143 = score(doc=156,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.2708308 = fieldWeight in 156, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0546875 = fieldNorm(doc=156)
0.5 = coord(1/2)
- Abstract
- The present study investigates the ability of a bibliometric based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
- Date
- 8. 3.2007 19:55:22
- Type
- a
-
Nielsen, M.L.: Thesaurus construction : key issues and selected readings (2004)
0.02
0.020622315 = product of:
0.04124463 = sum of:
0.04124463 = sum of:
0.0054831975 = weight(_text_:a in 5006) [ClassicSimilarity], result of:
0.0054831975 = score(doc=5006,freq=4.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.12611452 = fieldWeight in 5006, product of:
2.0 = tf(freq=4.0), with freq of:
4.0 = termFreq=4.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=5006)
0.03576143 = weight(_text_:22 in 5006) [ClassicSimilarity], result of:
0.03576143 = score(doc=5006,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.2708308 = fieldWeight in 5006, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0546875 = fieldNorm(doc=5006)
0.5 = coord(1/2)
- Abstract
- The purpose of this selected bibliography is to introduce issues and problems in relation to thesaurus construction and to present a set of readings that may be used in practical thesaurus design. The concept of the thesaurus is discussed, along with its purpose and how the concept has evolved over the years in response to new IR technologies. Different approaches to thesaurus construction are introduced, and readings dealing with specific problems and developments in the collection, formation and organisation of thesaurus concepts and terms are presented. Primarily manual construction methods are discussed, but the bibliography also refers to research about techniques for automatic thesaurus construction.
- Date
- 18. 5.2006 20:06:22
- Type
- a
-
Dextre Clarke, S.G.: Evolution towards ISO 25964 : an international standard with guidelines for thesauri and other types of controlled vocabulary (2007)
0.02
0.01981932 = product of:
0.03963864 = sum of:
0.03963864 = sum of:
0.003877206 = weight(_text_:a in 749) [ClassicSimilarity], result of:
0.003877206 = score(doc=749,freq=2.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.089176424 = fieldWeight in 749, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=749)
0.03576143 = weight(_text_:22 in 749) [ClassicSimilarity], result of:
0.03576143 = score(doc=749,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.2708308 = fieldWeight in 749, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0546875 = fieldNorm(doc=749)
0.5 = coord(1/2)
- Date
- 8.12.2007 19:25:22
- Type
- a
-
Aitchison, J.; Dextre Clarke, S.G.: ¬The Thesaurus : a historical viewpoint, with a look to the future (2004)
0.02
0.019722667 = product of:
0.039445333 = sum of:
0.039445333 = sum of:
0.008792677 = weight(_text_:a in 5005) [ClassicSimilarity], result of:
0.008792677 = score(doc=5005,freq=14.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.20223314 = fieldWeight in 5005, product of:
3.7416575 = tf(freq=14.0), with freq of:
14.0 = termFreq=14.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=5005)
0.030652655 = weight(_text_:22 in 5005) [ClassicSimilarity], result of:
0.030652655 = score(doc=5005,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.23214069 = fieldWeight in 5005, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.046875 = fieldNorm(doc=5005)
0.5 = coord(1/2)
- Abstract
- After a period of experiment and evolution in the 1950s and 1960s, a fairly standard format for thesauri was established with the publication of the influential Thesaurus of Engineering and Scientific Terms (TEST) in 1967. This and other early thesauri relied primarily on the presentation of terms in alphabetical order. The value of a classified presentation was subsequently realised, and in particular the technique of facet analysis has profoundly influenced thesaurus evolution. Thesaurofacet and the Art & Architecture Thesaurus have acted as models for two distinct breeds of thesaurus using faceted displays of terms. As of the 1990s, the expansion of end-user access to vast networked resources is imposing further requirements on the style and structure of controlled vocabularies. The international standards for thesauri, first conceived in a print-based era, are badly in need of updating. Work is in hand in the UK and the USA to revise and develop standards in support of electronic thesauri.
- Date
- 22. 9.2007 15:46:13
- Type
- a
-
Bagheri, M.: Development of thesauri in Iran (2006)
0.02
0.018649647 = product of:
0.037299294 = sum of:
0.037299294 = sum of:
0.006646639 = weight(_text_:a in 260) [ClassicSimilarity], result of:
0.006646639 = score(doc=260,freq=8.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.15287387 = fieldWeight in 260, product of:
2.828427 = tf(freq=8.0), with freq of:
8.0 = termFreq=8.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=260)
0.030652655 = weight(_text_:22 in 260) [ClassicSimilarity], result of:
0.030652655 = score(doc=260,freq=2.0), product of:
0.13204344 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.037706986 = queryNorm
0.23214069 = fieldWeight in 260, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.046875 = fieldNorm(doc=260)
0.5 = coord(1/2)
- Abstract
- The need for Persian thesauri became apparent during the late 1960s with the advent of documentation centres in Iran. The first Persian controlled vocabulary was published by IRANDOC in 1977. Other centres worked on translations of existing thesauri, but it was soon realised that these efforts did not meet the needs of the centres. After the Islamic revolution in 1979, the foundation of new centres intensified the need for Persian thesauri, especially in the fields of history and government documents. Also, during the Iran-Iraq war, Iranian research centres produced reports in scientific and technical fields, both to support military requirements and to meet society's needs. In order to provide a comprehensive thesaurus, the Council of Scientific Research of Iran approved a project for the compilation of such a work. Nowadays, 12 Persian thesauri are available and others are being prepared, based on the literary corpus and conformity with characteristics of Iranian culture.
- Source
- Indexer. 25(2006) no.1, S.19-22
- Type
- a
-
Moreira, A.; Alvarenga, L.; Paiva Oliveira, A. de: "Thesaurus" and "Ontology" : a study of the definitions found in the computer and information science literature (2004)
0.00
0.0025268705 = product of:
0.005053741 = sum of:
0.005053741 = product of:
0.010107482 = sum of:
0.010107482 = weight(_text_:a in 3726) [ClassicSimilarity], result of:
0.010107482 = score(doc=3726,freq=74.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.23247388 = fieldWeight in 3726, product of:
8.602325 = tf(freq=74.0), with freq of:
74.0 = termFreq=74.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0234375 = fieldNorm(doc=3726)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- This is a comparative analysis of the term ontology, used in the computer science domain, with the term thesaurus, used in the information science domain. The aim of the study is to establish the main convergence points of these two knowledge representation instruments and to point out their differences. In order to fulfill this goal an analytical-synthetic method was applied to extract the meaning underlying each of the selected definitions of the instruments. The definitions were obtained from texts well accepted by the research community from both areas. The definitions were applied to a KWIC system in order to rotate the terms that were examined qualitatively and quantitatively. We concluded that thesauri and ontologies operate at the same knowledge level, the epistemological level, in spite of different origins and purposes.
- Content
- "Thesaurus" definitions taken from the information science literature "A thesaurus is a controlled vocabulary arranged in a known order and structured so that equivalence, homographic, hierarchical, and associative relationships among terms are displayed clearly and identified by standardized relationship indicators that are employed reciprocally." (ANSI/NISO Z39-19-1993) "Thesaurus is a specialized, normalized, postcoordinate language used for documentaries means, where the linguistic elements that composes it - single or composed terms - are related among themselves syntactically and semantically." (Translated into English by the authors from the original in Portuguese: Currás 1995, 88.) "[...] an authority file, which can lead the user from one concept to another via various heuristic or intuitive paths." (Howerton 1965 apud Gilchrist 1971, 5) "[...] is a lexical authority list, without notation, which differs from an alphabetical subject heading list in that the lexical units, being smaller, are more amenable to post-coordinate indexing." (Gilchrist 1971, 2) [...] "a dynamic controlled vocabulary of terms related semantically and by generic relation covering a specific knowledge domain." (Translated into English by the authors from the original in Portuguese: UNESCO 1973, 6.) [...] "a terminological control device used in the translation of the natural language of the documents, from the indexers or from the users in a more restricted system language (documentation language, information language)." (Translated into English by the authors from the original in Portuguese: UNESCO 1973, 6.)
"Ontologies" definitions taken from the computer science literature "[...] ontology is a representation vocabulary, often specialized to some domain or subject matter." (Chandrasekaran et al. 1999, 1) "[...] ontology is sometimes used to refer to a body of knowledge describing some domain, typically a commonsense knowledge domain, using a representation vocabulary." (Chandrasekaran et al. 1999, 1) "An ontology is a declarative model of the terms and relationships in a domain." (Eriksson et al. 1994, 1) "[...] an ontology is the (unspecified) conceptual system which we may assume to underlie a particular knowledge base." (Guarino and Giaretta 1995, 1) "Ontology as a representation of a conceptual system via a logical theory." (Guarino and Giaretta 1995, 1) "An ontology is an explicit specification of a conceptualization." (Gruber 1993, 1) "[...] An ontology is a formal description of entities and their properties, relationships, constraints, behaviors." (Gruninger and Fox 1995, 1) "An ontology is set of terms, associated with definitions in natural language and, if possible, using formal relations and constraints, about some domain of interest ..." (Hovy 1998, 2) "Each Ontology is a set of terms of interest in a particular information domain, expressed using DL ..." (Mena et al. 1996, 3) "[...] An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base." (Swartout et al. 1996, 1) "An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms and some specification of their meaning." (Uschold 1996, 3) "Ontologies are agreements about shared conceptualizations." (Uschold and Gruninger 1996, 6) "[...] a vocabulary of terms and a specification of their relationships." (Wiederhold 1994, 6)
- Type
- a
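In this record only the term "a" matches, and the explain tree applies coord(1/2) twice. The sketch below checks that arithmetic, again assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); queryNorm and fieldNorm are copied from the listing and the variable names are illustrative only.

```python
import math

# Values copied from the explain output for doc 3726.
freq, doc_freq, max_docs = 74.0, 37942, 44218
query_norm, field_norm = 0.037706986, 0.0234375

idf = 1.0 + math.log(max_docs / (doc_freq + 1))
query_weight = idf * query_norm
field_weight = math.sqrt(freq) * idf * field_norm
weight = query_weight * field_weight

# Two nested coord(1/2) factors appear in the tree, so the weight is quartered.
final = weight * 0.5 * 0.5
print(round(final, 7))
```

The result agrees with the listed 0.0025268705 up to rounding. Although "a" occurs 74 times here, the small fieldNorm (0.0234375, reflecting a long field) and the missing match on "22" keep this record well below the records that match both terms.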
-
Lee, M.; Baillie, S.; Dell'Oro, J.: TML: a Thesaural Markup Language (200?)
0.00
0.0024924895 = product of:
0.004984979 = sum of:
0.004984979 = product of:
0.009969958 = sum of:
0.009969958 = weight(_text_:a in 1622) [ClassicSimilarity], result of:
0.009969958 = score(doc=1622,freq=18.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.22931081 = fieldWeight in 1622, product of:
4.2426405 = tf(freq=18.0), with freq of:
18.0 = termFreq=18.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=1622)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- Thesauri are used to provide controlled vocabularies for resource classification. Their use can greatly assist document discovery because thesauri mandate a consistent shared terminology for describing documents. A particular thesaurus classifies documents according to an information community's needs. As a result, there are many different thesaural schemas. This has led to a proliferation of schema-specific thesaural systems. In our research, we exploit schematic regularities to design a generic thesaural ontology and specify it as a markup language. The language provides a common representational framework in which to encode the idiosyncrasies of specific thesauri. This approach has several advantages: it offers consistent syntax and semantics in which to express thesauri; it allows general purpose thesaural applications to leverage many thesauri; and it supports a single thesaural user interface by which information communities can consistently organise, score and retrieve electronic documents.
-
Shearer, J.R.: ¬A practical exercise in building a thesaurus (2004)
0.00
0.002477056 = product of:
0.004954112 = sum of:
0.004954112 = product of:
0.009908224 = sum of:
0.009908224 = weight(_text_:a in 4857) [ClassicSimilarity], result of:
0.009908224 = score(doc=4857,freq=10.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.22789092 = fieldWeight in 4857, product of:
3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0625 = fieldNorm(doc=4857)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- A nine-stage procedure to build a thesaurus systematically is presented. Each stage offers exercises to put the theory into practice, using agriculture as the sample topic area. Model solutions are given and discussed.
- Type
- a
-
Eckert, K.; Pfeffer, M.; Stuckenschmidt, H.: Assessing thesaurus-based annotations for semantic search applications (2008)
0.00
0.002374294 = product of:
0.004748588 = sum of:
0.004748588 = product of:
0.009497176 = sum of:
0.009497176 = weight(_text_:a in 1528) [ClassicSimilarity], result of:
0.009497176 = score(doc=1528,freq=12.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.21843673 = fieldWeight in 1528, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=1528)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- Statistical methods for automated document indexing are becoming an alternative to the manual assignment of keywords. We argue that the quality of the thesaurus used as a basis for indexing in regard to its ability to adequately cover the contents to be indexed and as a basis for the specific indexing method used is of crucial importance in automatic indexing. We present an interactive tool for thesaurus evaluation that is based on a combination of statistical measures and appropriate visualisation techniques that supports the detection of potential problems in a thesaurus. We describe the methods used and show that the tool supports the detection and correction of errors, leading to a better indexing result.
- Type
- a
-
Naumis Pena, C.: Evaluation of educational thesauri (2006)
0.00
0.002374294 = product of:
0.004748588 = sum of:
0.004748588 = product of:
0.009497176 = sum of:
0.009497176 = weight(_text_:a in 2257) [ClassicSimilarity], result of:
0.009497176 = score(doc=2257,freq=12.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.21843673 = fieldWeight in 2257, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=2257)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- For years, Mexico has had a distance learning system backed by television-signal-transmitted videos. The change to digital and computer transmission demands organizing the information system and its subject contents through a thesaurus. To prepare the thesaurus, an evaluation of existing thesauri and standards for data exchange was carried out, aimed at retrieving subject contents and scheduling broadcasting. A methodology for evaluating thesauri was proposed and compared with a virtual educational platform, and a basic structure for setting up the information system was recommended.
- Source
- Knowledge organization for a global learning society: Proceedings of the 9th International ISKO Conference, 4-7 July 2006, Vienna, Austria. Hrsg.: G. Budin, C. Swertz u. K. Mitgutsch
- Type
- a
-
Losee, R.M.: Decisions in thesaurus construction and use (2007)
0.00
0.0021981692 = product of:
0.0043963385 = sum of:
0.0043963385 = product of:
0.008792677 = sum of:
0.008792677 = weight(_text_:a in 924) [ClassicSimilarity], result of:
0.008792677 = score(doc=924,freq=14.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.20223314 = fieldWeight in 924, product of:
3.7416575 = tf(freq=14.0), with freq of:
14.0 = termFreq=14.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=924)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of more than one set of possible subclasses should be used. Based on retrospective measurements or estimates of future performance when using thesaurus terms in document ordering, decisions are made so as to maximize performance. These decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus is described, consistent with the decision criteria developed here. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in "breadcrumb navigation".
- Type
- a
-
Assem, M. van; Malaisé, V.; Miles, A.; Schreiber, G.: ¬A method to convert thesauri to SKOS (2006)
0.00
0.0021981692 = product of:
0.0043963385 = sum of:
0.0043963385 = product of:
0.008792677 = sum of:
0.008792677 = weight(_text_:a in 4642) [ClassicSimilarity], result of:
0.008792677 = score(doc=4642,freq=14.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.20223314 = fieldWeight in 4642, product of:
3.7416575 = tf(freq=14.0), with freq of:
14.0 = termFreq=14.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=4642)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- Thesauri can be useful resources for indexing and retrieval on the Semantic Web, but often they are not published in RDF/OWL. To convert thesauri to RDF for use in Semantic Web applications and to ensure the quality and utility of the conversion, a structured method is required. Moreover, if different thesauri are to be interoperable without complicated mappings, a standard schema for thesauri is required. This paper presents a method for conversion of thesauri to the SKOS RDF/OWL schema, which is a proposal for such a standard under development by W3C's Semantic Web Best Practices Working Group. We apply the method to three thesauri: IPSV, GTAA and MeSH. With these case studies we evaluate our method and the applicability of SKOS for representing thesauri.
-
Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002)
0.00
0.002189429 = product of:
0.004378858 = sum of:
0.004378858 = product of:
0.008757716 = sum of:
0.008757716 = weight(_text_:a in 5226) [ClassicSimilarity], result of:
0.008757716 = score(doc=5226,freq=20.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.20142901 = fieldWeight in 5226, product of:
4.472136 = tf(freq=20.0), with freq of:
20.0 = termFreq=20.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0390625 = fieldNorm(doc=5226)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- Tseng constructs a word co-occurrence based thesaurus by means of the automatic analysis of Chinese text. Words are identified by a longest dictionary match supplemented by a key word extraction algorithm that merges back nearby tokens and accepts shorter strings of characters if they occur more often than the longest string. Single character auxiliary words are a major source of error but this can be greatly reduced with the use of a 70-character 2680 word stop list. Extracted terms with their associated document weights are sorted by decreasing frequency and the top of this list is associated using a Dice coefficient modified to account for longer documents on the weights of term pairs. Co-occurrence is not in the document as a whole but in paragraph or sentence size sections in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles and judges were asked to review the top 50 terms associated with each of 30 single word query terms. They determined 69% to be relevant.
- Type
- a
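To make the co-occurrence idea in this abstract concrete, here is a minimal sketch of term association with the plain Dice coefficient, 2 * n(x, y) / (n(x) + n(y)), counted over sentence-sized sections. It is not Tseng's method: the abstract mentions a modified Dice coefficient that accounts for document length and term weights, which is not reproduced here. The function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def dice_associations(sections):
    """Plain Dice coefficient for every term pair that co-occurs in a section.

    `sections` is a list of term lists, one per sentence- or paragraph-sized unit.
    """
    term_count = Counter()   # number of sections containing each term
    pair_count = Counter()   # number of sections containing both terms of a pair
    for terms in sections:
        unique = sorted(set(terms))
        term_count.update(unique)
        pair_count.update(combinations(unique, 2))
    return {
        pair: 2.0 * n / (term_count[pair[0]] + term_count[pair[1]])
        for pair, n in pair_count.items()
    }

# Toy sections (hypothetical data, already segmented into terms).
sections = [
    ["thesaurus", "indexing", "retrieval"],
    ["thesaurus", "indexing"],
    ["retrieval", "query"],
]
scores = dice_associations(sections)
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

Counting within small sections rather than whole documents keeps the number of candidate pairs manageable, which matches the computational motivation given in the abstract.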
-
Fischer, D.H.: Converting a thesaurus to OWL : Notes on the paper "The National Cancer Institute's Thesaurus and Ontology" (2004)
0.00
0.0021674242 = product of:
0.0043348484 = sum of:
0.0043348484 = product of:
0.008669697 = sum of:
0.008669697 = weight(_text_:a in 2362) [ClassicSimilarity], result of:
0.008669697 = score(doc=2362,freq=40.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.19940455 = fieldWeight in 2362, product of:
6.3245554 = tf(freq=40.0), with freq of:
40.0 = termFreq=40.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.02734375 = fieldNorm(doc=2362)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- The paper analysed here is a kind of position paper. In order to get a better understanding of the reported work I used the retrieval interface of the thesaurus, the so-called NCI DTS Browser accessible via the Web3, and I perused the cited OWL file4 with numerous "Find" and "Find next" string searches. In addition the file was imported into Protégé 2000, Release 2.0, with OWL Plugin 1.0 and Racer Plugin 1.7.14. At the end of the paper's introduction the authors say: "In the following sections, this paper will describe the terminology development process at NCI, and the issues associated with converting a description logic based nomenclature to a semantically rich OWL ontology." While I will not deal with the first part, i.e. the terminology development process at NCI, I do not see the thesaurus as a description logic based nomenclature, or its current state and conversion already result in a "rich" OWL ontology. What does "rich" mean here? According to my view there is a great quantity of concepts and links but a very poor description logic structure which enables inferences. And what does the following really mean, which is said a few lines previously: "Although editors have defined a number of named ontologic relations to support the description-logic based structure of the Thesaurus, additional relationships are considered for inclusion as required to support dependent applications."
According to my findings several relations available in the thesaurus query interface as "roles", are not used, i.e. there are not yet any assertions with them. And those which are used do not contribute to complete concept definitions of concepts which represent thesaurus main entries. In other words: The authors claim to already have a "description logic based nomenclature", where there is not yet one which deserves that title by being much more than a thesaurus with strict subsumption and additional inheritable semantic links. In the last section of the paper the authors say: "The most time consuming process in this conversion was making a careful analysis of the Thesaurus to understand the best way to translate it into OWL." "For other conversions, these same types of distinctions and decisions must be made. The expressive power of a proprietary encoding can vary widely from that in OWL or RDF. Understanding the original semantics and engineering a solution that most closely duplicates it is critical for creating a useful and accurate ontology." My question is: What decisions were made and are they exemplary, can they be recommended as "the best way"? I raise strong doubts with respect to that, and I miss more profound discussions of the issues at stake. The following notes are dedicated to a critical description and assessment of the results of that conversion activity. They are written in a tutorial style more or less addressing students, but myself being a learner especially in the field of medical knowledge representation I do not speak "ex cathedra".
-
Gilchrist, A.: Thesauri, taxonomies and ontologies : an etymological note (2003)
0.00
0.0021674242 = product of:
0.0043348484 = sum of:
0.0043348484 = product of:
0.008669697 = sum of:
0.008669697 = weight(_text_:a in 4455) [ClassicSimilarity], result of:
0.008669697 = score(doc=4455,freq=10.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.19940455 = fieldWeight in 4455, product of:
3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=4455)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- The amount of work to be done in rendering the digital information space more efficient and effective has attracted a wide range of disciplines which, in turn, has given rise to a degree of confusion in the terminology applied to information problems. This note seeks to shed some light on the three terms thesauri, taxonomies and ontologies as they are currently being used by, among others, information scientists, AI practitioners, and those working on the foundations of the semantic Web. The paper is not a review of the techniques themselves.
- Type
- a
-
Quick Guide to Publishing a Thesaurus on the Semantic Web (2008)
0.00
0.0021674242 = product of:
0.0043348484 = sum of:
0.0043348484 = product of:
0.008669697 = sum of:
0.008669697 = weight(_text_:a in 4656) [ClassicSimilarity], result of:
0.008669697 = score(doc=4656,freq=10.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.19940455 = fieldWeight in 4656, product of:
3.1622777 = tf(freq=10.0), with freq of:
10.0 = termFreq=10.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.0546875 = fieldNorm(doc=4656)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- This document describes in brief how to express the content and structure of a thesaurus, and metadata about a thesaurus, in RDF. Using RDF allows data to be linked to and/or merged with other RDF data by semantic web applications. The Semantic Web, which is based on the Resource Description Framework (RDF), provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
- Editor
- Miles, A.
-
Shiri, A.A.; Revie, C.; Chowdhury, G.: Thesaurus-assisted search term selection and query expansion : a review of user-centred studies (2002)
0.00
0.002035109 = product of:
0.004070218 = sum of:
0.004070218 = product of:
0.008140436 = sum of:
0.008140436 = weight(_text_:a in 1330) [ClassicSimilarity], result of:
0.008140436 = score(doc=1330,freq=12.0), product of:
0.043477926 = queryWeight, product of:
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.037706986 = queryNorm
0.18723148 = fieldWeight in 1330, product of:
3.4641016 = tf(freq=12.0), with freq of:
12.0 = termFreq=12.0
1.153047 = idf(docFreq=37942, maxDocs=44218)
0.046875 = fieldNorm(doc=1330)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies that adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections: first, studies on thesaurus-aided search term selection; and second, studies dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.
- Type
- a