Search (239 results, page 1 of 12)

Pollard, R.: Hypertext presentation of thesauri used in on-line searching (1990) 0.05

0.04985508 = product of:
  0.14956523 = sum of:
    0.10768185 = weight(_text_:line in 4892) [ClassicSimilarity], result of:
      0.10768185 = score(doc=4892,freq=2.0), product of:
        0.21724595 = queryWeight, product of:
          5.6078424 = idf(docFreq=440, maxDocs=44218)
          0.038739666 = queryNorm
        0.4956679 = fieldWeight in 4892, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.6078424 = idf(docFreq=440, maxDocs=44218)
          0.0625 = fieldNorm(doc=4892)
    0.010552166 = weight(_text_:information in 4892) [ClassicSimilarity], result of:
      0.010552166 = score(doc=4892,freq=2.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.1551638 = fieldWeight in 4892, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=4892)
    0.031331215 = weight(_text_:retrieval in 4892) [ClassicSimilarity], result of:
      0.031331215 = score(doc=4892,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.26736724 = fieldWeight in 4892, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=4892)
  0.33333334 = coord(3/9)

Abstract: Explores the strengths and limitations of hypertext for the online presentation of thesauri used in information retrieval. Examines the ability of hypertext to support each of 3 common types of thesaurus display: graphic, alphabetical, and hierarchical. Presents a design for a hypertext-based hierarchical display that addresses many inadequacies of printed hierarchical displays. Illustrates how the design might be implemented using a commercially available hypertext system. Considers issues related to the implementation and evaluation of hypertext-based thesauri.
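Note on the score breakdown for doc 4892: the tree can be re-derived by hand from the values it lists. The following is a minimal Python sketch, assuming the formulas documented for Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names `idf` and `term_score` are illustrative only and belong to no library; all constants are copied from the listing.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each "weight(...)" subtree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                   # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain tree for doc 4892.
MAX_DOCS = 44218
QUERY_NORM = 0.038739666
FIELD_NORM = 0.0625

# term -> (termFreq, docFreq) for the three matching clauses.
matched = {
    "line": (2.0, 440),
    "information": (2.0, 20772),
    "retrieval": (2.0, 5836),
}

clause_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for freq, doc_freq in matched.values()
)
coord = 3 / 9  # 3 of the 9 query clauses matched, per coord(3/9)
total = clause_sum * coord

print(f"sum of clauses = {clause_sum:.6f}, final score = {total:.6f}")
```

The result should agree with the listed 0.04985508 up to floating-point rounding, since Lucene computes these values in single precision. The six query clauses that did not match this record are not shown on the page, so the sketch takes the denominator 9 directly from the listing.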

Keyser, P. de: Indexing : from thesauri to the Semantic Web (2012) 0.05

0.046964016 = product of:
  0.14089204 = sum of:
    0.013707667 = weight(_text_:information in 3197) [ClassicSimilarity], result of:
      0.013707667 = score(doc=3197,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.20156369 = fieldWeight in 3197, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3197)
    0.11143831 = weight(_text_:techniques in 3197) [ClassicSimilarity], result of:
      0.11143831 = score(doc=3197,freq=10.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.65299517 = fieldWeight in 3197, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=3197)
    0.01574607 = product of:
      0.03149214 = sum of:
        0.03149214 = weight(_text_:22 in 3197) [ClassicSimilarity], result of:
          0.03149214 = score(doc=3197,freq=2.0), product of:
            0.13565971 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038739666 = queryNorm
            0.23214069 = fieldWeight in 3197, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3197)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)

Abstract: Indexing consists of both novel and more traditional techniques. Cutting-edge indexing techniques, such as automatic indexing, ontologies, and topic maps, were developed independently of older techniques such as thesauri, but it is now recognized that these older methods also hold expertise. Indexing describes various traditional and novel indexing techniques, giving information professionals and students of library and information sciences a broad and comprehensible introduction to indexing. This title consists of twelve chapters: an Introduction to subject headings and thesauri; Automatic indexing versus manual indexing; Techniques applied in automatic indexing of text material; Automatic indexing of images; The black art of indexing moving images; Automatic indexing of music; Taxonomies and ontologies; Metadata formats and indexing; Tagging; Topic maps; Indexing the web; and The Semantic Web.
Date: 24. 8.2016 14:03:22
Series: Chandos information professional series
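Note on the nested clause in the breakdown for doc 3197: the term "22" sits inside its own sub-query, which is scaled by coord(1/2) before it joins the outer sum. The following is a minimal Python sketch of that composition, assuming the ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helpers `term_score` and `coord_sum` are hypothetical names, and reading the sub-query as having two optional clauses is an inference from coord(1/2).

```python
import math

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # queryWeight * fieldWeight for a single matching term.
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)

def coord_sum(matched_scores, total_clauses):
    # Sum of the matching clauses, scaled by the fraction of clauses matched.
    return sum(matched_scores) * len(matched_scores) / total_clauses

# Values copied from the explain tree for doc 3197.
MAX_DOCS = 44218
QUERY_NORM = 0.038739666
FIELD_NORM = 0.046875

information = term_score(6.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
techniques = term_score(10.0, 1467, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Inner sub-query: only "22" matched, 1 of 2 clauses, hence coord(1/2).
inner = coord_sum([term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)], 2)

# Outer query: 3 of 9 clauses matched, hence coord(3/9).
outer = coord_sum([information, techniques, inner], 9)

print(f"inner clause = {inner:.6f}, final score = {outer:.6f}")
```

The inner value corresponds to the 0.01574607 line of the tree and the outer value to the record's 0.046964016, again up to single-precision rounding.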

MacFarlane, A.: Knowledge organisation and its role in multimedia information retrieval (2016) 0.05

0.045577165 = product of:
  0.13673149 = sum of:
    0.013707667 = weight(_text_:information in 2911) [ClassicSimilarity], result of:
      0.013707667 = score(doc=2911,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.20156369 = fieldWeight in 2911, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2911)
    0.05254405 = weight(_text_:retrieval in 2911) [ClassicSimilarity], result of:
      0.05254405 = score(doc=2911,freq=10.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.44838852 = fieldWeight in 2911, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2911)
    0.07047977 = weight(_text_:techniques in 2911) [ClassicSimilarity], result of:
      0.07047977 = score(doc=2911,freq=4.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.4129904 = fieldWeight in 2911, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=2911)
  0.33333334 = coord(3/9)

Abstract: Various kinds of knowledge organisation, such as thesauri, are routinely used to label or tag multimedia content such as images and music and to support information retrieval, i.e. user search for such content. In this paper, we outline why this is the case, in particular focusing on the semantic gap between content and concept based multimedia retrieval. We survey some indexing vocabularies used for multimedia retrieval, and argue that techniques such as thesauri will be needed for the foreseeable future in order to support users in their need for multimedia content. In particular, we argue that artificial intelligence techniques are not mature enough to solve the problem of indexing multimedia conceptually and will not be able to replace human indexers for the foreseeable future.
Content: Contribution to a special issue: The Great Debate: "This House Believes that the Traditional Thesaurus has no Place in Modern Information Retrieval." [19 February 2015, 14:00-17:30 preceded by ISKO UK AGM and followed by networking, wine and nibbles; cf.: http://www.iskouk.org/content/great-debate].

Srinivasan, P.: Thesaurus construction (1992) 0.04

0.036008604 = product of:
  0.10802581 = sum of:
    0.011192262 = weight(_text_:information in 3504) [ClassicSimilarity], result of:
      0.011192262 = score(doc=3504,freq=4.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.16457605 = fieldWeight in 3504, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3504)
    0.046996824 = weight(_text_:retrieval in 3504) [ClassicSimilarity], result of:
      0.046996824 = score(doc=3504,freq=8.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.40105087 = fieldWeight in 3504, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3504)
    0.049836725 = weight(_text_:techniques in 3504) [ClassicSimilarity], result of:
      0.049836725 = score(doc=3504,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.2920283 = fieldWeight in 3504, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=3504)
  0.33333334 = coord(3/9)

Abstract: Thesauri are valuable structures for Information Retrieval systems. A thesaurus provides a precise and controlled vocabulary which serves to coordinate document indexing and document retrieval. In both indexing and retrieval, a thesaurus may be used to select the most appropriate terms. Additionally, the thesaurus can assist the searcher in reformulating search strategies if required. Examines the important features of thesauri. This should allow the reader to differentiate between thesauri. Next, a brief overview of the manual thesaurus construction process is given. 2 major approaches for automatic thesaurus construction have been selected for detailed examination. The first is on thesaurus construction from collections of documents, and the 2nd, on thesaurus construction by merging existing thesauri. These 2 methods were selected since they rely on statistical techniques alone and are also significantly different from each other. Programs written in C language accompany the discussion of these approaches.
Source: Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates

Dextre Clarke, S.G.: Origins and trajectory of the long thesaurus debate (2016) 0.03

0.033992358 = product of:
  0.101977065 = sum of:
    0.009326885 = weight(_text_:information in 2913) [ClassicSimilarity], result of:
      0.009326885 = score(doc=2913,freq=4.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.13714671 = fieldWeight in 2913, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2913)
    0.033917036 = weight(_text_:retrieval in 2913) [ClassicSimilarity], result of:
      0.033917036 = score(doc=2913,freq=6.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.28943354 = fieldWeight in 2913, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2913)
    0.058733147 = weight(_text_:techniques in 2913) [ClassicSimilarity], result of:
      0.058733147 = score(doc=2913,freq=4.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.34415868 = fieldWeight in 2913, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2913)
  0.33333334 = coord(3/9)

Abstract: The information retrieval thesaurus emerged in the 1950s, settled down to a more-or-less standard format in the 1970s and has continued to evolve marginally since then. Throughout its whole lifetime, doubts have been expressed about its efficacy with emphasis latterly on cost-effectiveness. Prolonged testing of different styles of index language in the 1970s failed to settle the doubts. The arena occupied by the debate has moved from small isolated databases in the post-war era to diverse situations nowadays with the whole Internet at one extreme and small in-house collections at the other. Sophisticated statistical techniques now dominate the retrieval landscape on the Internet but leave opportunities for the thesaurus and other knowledge organization techniques in niches such as image libraries and corporate intranets. The promise of an ontology-driven semantic web with linked data resources opens another opportunity. Thus much scope remains for research to establish the usefulness of the thesaurus in these places and to inspire its continuing evolution.
Content: Contribution to a special issue: The Great Debate: "This House Believes that the Traditional Thesaurus has no Place in Modern Information Retrieval." [19 February 2015, 14:00-17:30 preceded by ISKO UK AGM and followed by networking, wine and nibbles; cf.: http://www.iskouk.org/content/great-debate].

Byrne, C.C.; McCracken, S.A.: An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.03

0.033624496 = product of:
  0.100873485 = sum of:
    0.022384524 = weight(_text_:information in 4483) [ClassicSimilarity], result of:
      0.022384524 = score(doc=4483,freq=4.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.3291521 = fieldWeight in 4483, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=4483)
    0.046996824 = weight(_text_:retrieval in 4483) [ClassicSimilarity], result of:
      0.046996824 = score(doc=4483,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.40105087 = fieldWeight in 4483, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=4483)
    0.03149214 = product of:
      0.06298428 = sum of:
        0.06298428 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
          0.06298428 = score(doc=4483,freq=2.0), product of:
            0.13565971 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038739666 = queryNorm
            0.46428138 = fieldWeight in 4483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)

Date: 15. 3.2000 10:22:37
Source: Journal of information science. 25(1999) no.2, p.113-131

Jones, S.: A thesaurus data model for an intelligent retrieval system (1993) 0.03

0.032258723 = product of:
  0.096776165 = sum of:
    0.013707667 = weight(_text_:information in 5279) [ClassicSimilarity], result of:
      0.013707667 = score(doc=5279,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.20156369 = fieldWeight in 5279, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=5279)
    0.033231772 = weight(_text_:retrieval in 5279) [ClassicSimilarity], result of:
      0.033231772 = score(doc=5279,freq=4.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.2835858 = fieldWeight in 5279, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5279)
    0.049836725 = weight(_text_:techniques in 5279) [ClassicSimilarity], result of:
      0.049836725 = score(doc=5279,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.2920283 = fieldWeight in 5279, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=5279)
  0.33333334 = coord(3/9)

Abstract: This paper demonstrates the application of conventional database design techniques to thesaurus representation. The thesaurus is considered as a printed document, as a semantic net, and as a relational database to be used in conjunction with an intelligent information retrieval system. Some issues raised by analysis of two standard thesauri include: the prevalence of compound terms and the representation of term structure; thesaurus redundancy and the extent to which it can be eliminated in machine-readable versions; the difficulty of exploiting thesaurus knowledge originally designed for human rather than automatic interpretation; deriving 'strength of association' measures between terms in a thesaurus considered as a semantic net; facet representation and the need for variations in the data model to cater for structural differences between thesauri. A complete schema of database tables is presented, with an outline suggestion for using the stored information when matching one or more thesaurus terms with a user's query.
Source: Journal of information science. 19(1993), p.167-178

Youlin, Z.; Baptista Nunes, J.M.; Zhonghua, D.: Construction and evolution of a Chinese Information Science and Information Service (CIS&IS) onto-thesaurus (2014) 0.03

0.031906556 = product of:
  0.095719665 = sum of:
    0.022384524 = weight(_text_:information in 1376) [ClassicSimilarity], result of:
      0.022384524 = score(doc=1376,freq=16.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.3291521 = fieldWeight in 1376, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1376)
    0.023498412 = weight(_text_:retrieval in 1376) [ClassicSimilarity], result of:
      0.023498412 = score(doc=1376,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.20052543 = fieldWeight in 1376, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1376)
    0.049836725 = weight(_text_:techniques in 1376) [ClassicSimilarity], result of:
      0.049836725 = score(doc=1376,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.2920283 = fieldWeight in 1376, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=1376)
  0.33333334 = coord(3/9)

Abstract: Thesauri are the most important tools for information and knowledge organization, and they undergo regular improvements according to the rapid development of new requirements and affordances of emerging information techniques. This paper attempts to integrate ontology into the conceptual organization scheme of thesauri and proposes a new solution to extend the functionality of thesauri based on ontological features, which is termed here an onto-thesaurus. In this study, a prototype system named the Chinese Information Science and Information Service onto-thesaurus system (CIS&IS) was developed to analyze the onto-thesaurus with the category of information science and information service in the Chinese Topic Classification Dictionary with a two-stage approach. The first stage aims to define and construct the onto-thesaurus. The second stage aims to realize the evolution function of the onto-thesaurus. The main purpose of this system was to achieve the function of self-learning and auto-evolution and to enable a much more effective conceptual retrieval by the newly proposed onto-thesaurus.

Garshol, L.M.: Metadata? Thesauri? Taxonomies? Topic Maps! : making sense of it all (2005) 0.03

0.029071974 = product of:
  0.13082388 = sum of:
    0.019385567 = weight(_text_:information in 4729) [ClassicSimilarity], result of:
      0.019385567 = score(doc=4729,freq=12.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.2850541 = fieldWeight in 4729, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4729)
    0.11143831 = weight(_text_:techniques in 4729) [ClassicSimilarity], result of:
      0.11143831 = score(doc=4729,freq=10.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.65299517 = fieldWeight in 4729, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=4729)
  0.22222222 = coord(2/9)

Abstract: The task of an information architect is to create web sites where users can actually find the information they are looking for. As the ocean of information rises and leaves what we seek ever more deeply buried in what we don't seek, this discipline becomes ever more relevant. Information architecture involves many different aspects of web site creation and organization, but its principal tools are information organization techniques developed in other disciplines. Most of these techniques come from library science, such as thesauri, taxonomies, and faceted classification. Topic maps are a relative newcomer to this area and bring with them the promise of better-organized web sites, compared to what is possible with existing techniques. However, it is not generally understood how topic maps relate to the traditional techniques, and what advantages and disadvantages they have, compared to these techniques. The aim of this paper is to help build a better understanding of these issues.
Source: Journal of information science. 30(2005) no.4, p.378-391

Li, K.W.; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis (2005) 0.03

0.028027367 = product of:
  0.0840821 = sum of:
    0.015828248 = weight(_text_:information in 3391) [ClassicSimilarity], result of:
      0.015828248 = score(doc=3391,freq=18.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.23274568 = fieldWeight in 3391, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
    0.035029367 = weight(_text_:retrieval in 3391) [ClassicSimilarity], result of:
      0.035029367 = score(doc=3391,freq=10.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.29892567 = fieldWeight in 3391, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
    0.033224486 = weight(_text_:techniques in 3391) [ClassicSimilarity], result of:
      0.033224486 = score(doc=3391,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.19468555 = fieldWeight in 3391, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
  0.33333334 = coord(3/9)

Abstract: For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge to generate an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages. To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
Source: Journal of the American Society for Information Science and Technology. 56(2005) no.3, p.272-281

Gilchrist, A.: The thesaurus in retrieval (1971) 0.03

0.027180506 = product of:
  0.12231228 = sum of:
    0.026380414 = weight(_text_:information in 4593) [ClassicSimilarity], result of:
      0.026380414 = score(doc=4593,freq=8.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.38790947 = fieldWeight in 4593, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=4593)
    0.095931865 = weight(_text_:retrieval in 4593) [ClassicSimilarity], result of:
      0.095931865 = score(doc=4593,freq=12.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.81864166 = fieldWeight in 4593, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=4593)
  0.22222222 = coord(2/9)

LCSH: Information retrieval
RSWK: Schlagwortnormdatei / Information Retrieval
Subject: Schlagwortnormdatei / Information Retrieval ; Information retrieval
Theme: Verbal documentation languages in online retrieval

Z39.19-2005: Guidelines for the construction, format, and management of monolingual controlled vocabularies (2005) 0.03

0.02708309 = product of:
  0.08124927 = sum of:
    0.007914125 = weight(_text_:information in 708) [ClassicSimilarity], result of:
      0.007914125 = score(doc=708,freq=2.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.116372846 = fieldWeight in 708, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=708)
    0.023498412 = weight(_text_:retrieval in 708) [ClassicSimilarity], result of:
      0.023498412 = score(doc=708,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.20052543 = fieldWeight in 708, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=708)
    0.049836725 = weight(_text_:techniques in 708) [ClassicSimilarity], result of:
      0.049836725 = score(doc=708,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.2920283 = fieldWeight in 708, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=708)
  0.33333334 = coord(3/9)

Abstract: This Standard presents guidelines and conventions for the contents, display, construction, testing, maintenance, and management of monolingual controlled vocabularies. This Standard focuses on controlled vocabularies that are used for the representation of content objects in knowledge organization systems including lists, synonym rings, taxonomies, and thesauri. This Standard should be regarded as a set of recommendations based on preferred techniques and procedures. Optional procedures are, however, sometimes described, e.g., for the display of terms in a controlled vocabulary. The primary purpose of vocabulary control is to achieve consistency in the description of content objects and to facilitate retrieval. Vocabulary control is accomplished by three principal methods: defining the scope, or meaning, of terms; using the equivalence relationship to link synonymous and nearly synonymous terms; and distinguishing among homographs.
Editor: National Information Standards Organization

Eckert, K.: Thesaurus analysis and visualization in semantic search applications (2007) 0.03

0.026882272 = product of:
  0.08064681 = sum of:
    0.011423056 = weight(_text_:information in 3222) [ClassicSimilarity], result of:
      0.011423056 = score(doc=3222,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.16796975 = fieldWeight in 3222, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.027693143 = weight(_text_:retrieval in 3222) [ClassicSimilarity], result of:
      0.027693143 = score(doc=3222,freq=4.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.23632148 = fieldWeight in 3222, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.04153061 = weight(_text_:techniques in 3222) [ClassicSimilarity], result of:
      0.04153061 = score(doc=3222,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.24335694 = fieldWeight in 3222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
  0.33333334 = coord(3/9)

Abstract: The use of thesaurus-based indexing is a common approach for increasing the performance of information retrieval. In this thesis, we examine the suitability of a thesaurus for a given set of information and evaluate improvements of existing thesauri to get better search results. In this area, we focus on two aspects: 1. We demonstrate an analysis of the indexing results achieved by an automatic document indexer and the involved thesaurus. 2. We propose a method for thesaurus evaluation which is based on a combination of statistical measures and appropriate visualization techniques that support the detection of potential problems in a thesaurus. In this chapter, we give an overview of the context of our work. Next, we briefly outline the basics of thesaurus-based information retrieval and describe the Collexis Engine that was used for our experiments. In Chapter 3, we describe two experiments in automatically indexing documents in the areas of medicine and economics with corresponding thesauri and compare the results to available manual annotations. Chapter 4 describes methods for assessing thesauri and visualizing the result in terms of a treemap. We depict examples of interesting observations supported by the method and show that we actually find critical problems. We conclude with a discussion of open questions and future research in Chapter 5.
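
The score explanation for doc 3222 above can be re-derived by hand. The following is a minimal Python sketch, assuming the usual Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function and variable names are illustrative and not part of Lucene; the input numbers are copied from the explanation tree.

```python
import math

# Constants copied from the explanation tree for doc 3222.
MAX_DOCS = 44218
QUERY_NORM = 0.038739666
FIELD_NORM = 0.0390625


def idf(doc_freq, max_docs):
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq)
    term_idf = idf(doc_freq, MAX_DOCS)            # idf(docFreq, maxDocs)
    query_weight = term_idf * QUERY_NORM          # queryWeight
    field_weight = tf * term_idf * FIELD_NORM     # fieldWeight
    return query_weight * field_weight


# (termFreq, docFreq) for the three matching query terms.
matched_terms = {
    "information": (6.0, 20772),
    "retrieval": (4.0, 5836),
    "techniques": (2.0, 1467),
}

term_sum = sum(term_score(freq, df) for freq, df in matched_terms.values())
coord = 3 / 9                                     # 3 of 9 query clauses matched
final_score = term_sum * coord

print(f"sum of term scores: {term_sum:.8f}")
print(f"final score:        {final_score:.8f}")
```

The result agrees with the listed 0.026882272 to about six decimal places; small differences in the last digits are expected because the listing shows single-precision values.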

The thesaurus: review, renaissance and revision (2004) 0.03
```
0.02667768 = product of:
  0.080033034 = sum of:
    0.013707667 = weight(_text_:information in 3243) [ClassicSimilarity], result of:
      0.013707667 = score(doc=3243,freq=24.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.20156369 = fieldWeight in 3243, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3243)
    0.031085476 = weight(_text_:retrieval in 3243) [ClassicSimilarity], result of:
      0.031085476 = score(doc=3243,freq=14.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.2652702 = fieldWeight in 3243, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3243)
    0.035239886 = weight(_text_:techniques in 3243) [ClassicSimilarity], result of:
      0.035239886 = score(doc=3243,freq=4.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.2064952 = fieldWeight in 3243, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3243)
  0.33333334 = coord(3/9)
```
Content

Includes, among others, the following statement by J. Aitchison and S. Dextre Clarke: "We face a paradox. Ostensibly, the need and the opportunity to apply thesauri to information retrieval are greater than ever before. On the other hand, users resist most efforts to persuade them to apply one. The drive for interoperability of systems means we must design our vocabularies for easy integration into downstream applications such as content management systems, indexing/metatagging interfaces, search engines, and portals. Summarizing the search for vocabularies that work more intuitively, we see that there are trends working in opposite directions. In the hugely popular taxonomies on the one hand, relationships between terms are more loosely defined than in thesauri. In the ontologies that will support computer-to-computer communications in AI applications such as the Semantic Web, we see the need for much more precisely defined term relationships."
Contains the contributions: Spiteri, L.F.: Word association testing and thesaurus construction: a pilot study. Aitchison, J., S.G. Dextre-Clarke: The Thesaurus: a historical viewpoint, with a look to the future. Thomas, A.R.: Teach yourself thesaurus: exercises, reading, resources. Shearer, J.R.: A practical exercise in building a thesaurus. Nielsen, M.L.: Thesaurus construction: key issues and selected readings. Riesland, M.A.: Tools of the trade: vocabulary management software. Will, L.: Thesaurus consultancy. Owens, L.A., P.A. Cochrane: Thesaurus evaluation. Greenberg, J.: User comprehension and application of information retrieval thesauri. Johnson, E.H.: Distributed thesaurus Web services. Thomas, A.R., S.K. Roe: An interview with Dr. Amy J. Warner. Landry, P.: Multilingual subject access: the linking approach of MACS.

Footnote

Reviewed in: KO 32(2005) no.2, pp. 95-97 (A. Gilchrist): "It might be thought unfortunate that the word thesaurus is assonant with prehistoric beasts but as this book clearly demonstrates, the thesaurus is undergoing a notable revival, and we can remind ourselves that the word comes from the Greek thesaurus, meaning a treasury. This is a useful and timely source book, bringing together ten chapters, following an Editorial introduction and culminating in an interview with a member of the team responsible for revising the NISO Standard Guidelines for the construction, format and management of monolingual thesauri; formal proof of the thesaural renaissance. Though predominantly an American publication, it is good to see four English authors as well as one from Canada and one from Denmark; and with a good balance of academics and practitioners. This has helped to widen the net in the citing of useful references. While the techniques of thesaurus construction are still basically sound, the Editors, in their introduction, point out that the thesaurus, in its sense of an information retrieval tool is almost exactly 50 years old, and that the information environment of today is radically different. They claim three purposes for the compilation: "to acquaint or remind the Library and Information Science community of the history of the development of the thesaurus and standards for thesaurus construction. to provide bibliographies and tutorials from which any reader can become more grounded in her or his understanding of thesaurus construction, use and evaluation. to address topics related to thesauri but that are unique to the current digital environment, or network of networks." This last purpose, understandably, tends to be the slightly more tentative part of the book, but as Rosenfeld and Morville said in their book Information architecture for the World Wide Web "thesauri [will] become a key tool for dealing with the growing size and importance of web sites and intranets".
The evidence supporting their belief has been growing steadily in the seven years since the first edition was published.
The didactic parts of the book are a collection of exercises, readings and resources constituting a "Teach yourself" chapter written by Alan Thomas, ending with the warning that "New challenges include how to devise multi-functional and user-sensitive vocabularies, corporate taxonomies and ontologies, and how to apply the transformative technology to them." This is absolutely right, and there is a need for some good writing that would tackle these issues. Another chapter, by James Shearer, skilfully manages to compress a practical exercise in building a thesaurus into some twenty A5 size pages. The third chapter in this set, by Marianne Lykke Nielsen, contains extensive reviews of key issues and selected readings under eight headings from the concept of the thesaurus, through the various construction stages and ending with automatic construction techniques. . . . This is a useful and approachable book. It is a pity that the index is such a poor advertisement for vocabulary control and usefulness."

LCSH

Information retrieval
Electronic information resource searching

RSWK

Informations- und Dokumentationswissenschaft / Information Retrieval / Inhaltserschließung / Thesaurus (BVB)

Subject

Informations- und Dokumentationswissenschaft / Information Retrieval / Inhaltserschließung / Thesaurus (BVB)
Information retrieval
Electronic information resource searching

Velasco, M.: Algoritmo de filtrado multitermino para la obtencion de relaciones jerarquicas en la construction automatica de un tesauro de descriptores (1999) 0.02

0.024289927 = product of:
  0.10930467 = sum of:
    0.08306122 = weight(_text_:techniques in 348) [ClassicSimilarity], result of:
      0.08306122 = score(doc=348,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.4867139 = fieldWeight in 348, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.078125 = fieldNorm(doc=348)
    0.02624345 = product of:
      0.0524869 = sum of:
        0.0524869 = weight(_text_:22 in 348) [ClassicSimilarity], result of:
          0.0524869 = score(doc=348,freq=2.0), product of:
            0.13565971 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038739666 = queryNorm
            0.38690117 = fieldWeight in 348, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=348)
      0.5 = coord(1/2)
  0.22222222 = coord(2/9)

Footnote: Translation of the title: Statistical filtering techniques applied to the obtention of hierarchical relationships in the automatic construction of a thesaurus
Source: Revista Española de Documentación Científica. 22(1999) no.1, pp. 34-49
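
The explanation for doc 348 above contains a nested clause (the term "22") that carries its own coord factor before the outer coord is applied. A minimal Python sketch of how the two levels combine, using the queryWeight and fieldWeight values reported in the tree; the variable names are illustrative only.

```python
# Leaf values copied from the explanation tree for doc 348.
techniques_score = 0.17065717 * 0.4867139      # queryWeight * fieldWeight

# The "22" term sits inside a nested clause where 1 of 2 sub-clauses matched.
inner_sum = 0.13565971 * 0.38690117            # queryWeight * fieldWeight
inner_score = inner_sum * (1 / 2)              # inner coord(1/2)

# Outer level: 2 of 9 top-level clauses matched.
outer_sum = techniques_score + inner_score
doc_348_score = outer_sum * (2 / 9)            # outer coord(2/9)

print(f"nested clause: {inner_score:.8f}")
print(f"outer sum:     {outer_sum:.8f}")
print(f"final score:   {doc_348_score:.8f}")
```

This is why a document matching fewer clauses can still rank close to others: the single strong "techniques" match dominates the sum, while both coord factors scale the result down.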

Mazzocchi, F.; Tiberi, M.; De Santis, B.; Plini, P.: Relational semantics in thesauri : an overview and some remarks at theoretical and practical levels (2007) 0.02

0.023681607 = product of:
  0.07104482 = sum of:
    0.011423056 = weight(_text_:information in 1462) [ClassicSimilarity], result of:
      0.011423056 = score(doc=1462,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.16796975 = fieldWeight in 1462, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1462)
    0.027693143 = weight(_text_:retrieval in 1462) [ClassicSimilarity], result of:
      0.027693143 = score(doc=1462,freq=4.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.23632148 = fieldWeight in 1462, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1462)
    0.031928614 = product of:
      0.06385723 = sum of:
        0.06385723 = weight(_text_:theories in 1462) [ClassicSimilarity], result of:
          0.06385723 = score(doc=1462,freq=2.0), product of:
            0.21161452 = queryWeight, product of:
              5.4624767 = idf(docFreq=509, maxDocs=44218)
              0.038739666 = queryNorm
            0.30176204 = fieldWeight in 1462, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4624767 = idf(docFreq=509, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1462)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)

Abstract: A thesaurus is a controlled vocabulary designed to allow for effective information retrieval. It consists of different kinds of semantic relationships, with the aim of guiding users to the choice of the most suitable index and search terms for expressing a certain concept. The relational semantics of a thesaurus deal with methods to connect terms with related meanings and are intended to enhance information recall capabilities. In this paper, focused on hierarchical relations, different aspects of the relational semantics of thesauri, and among them the possibility of developing richer structures, are analyzed. Thesauri are viewed as semantic tools providing, for operational purposes, the representation of the meaning of the terms. The paper stresses how theories of semantics, holding different perspectives about the nature of meaning and how it is represented, affect the design of the relational semantics of thesauri. The need for tools capable of representing the complexity of knowledge and of the semantics of terms as it occurs in the literature of their respective subject fields is advocated. It is underlined how this would contribute to improving the retrieval of information. To achieve this goal, even though in a preliminary manner, we explore the possibility of setting against the framework of thesaurus design the notions of language games and hermeneutic horizon.

Z39.19-1993: Guidelines for the construction, format, and management of monolingual thesauri (1993) 0.02

0.023534289 = product of:
  0.070602864 = sum of:
    0.018276889 = weight(_text_:information in 4092) [ClassicSimilarity], result of:
      0.018276889 = score(doc=4092,freq=6.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.2687516 = fieldWeight in 4092, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=4092)
    0.031331215 = weight(_text_:retrieval in 4092) [ClassicSimilarity], result of:
      0.031331215 = score(doc=4092,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.26736724 = fieldWeight in 4092, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=4092)
    0.02099476 = product of:
      0.04198952 = sum of:
        0.04198952 = weight(_text_:22 in 4092) [ClassicSimilarity], result of:
          0.04198952 = score(doc=4092,freq=2.0), product of:
            0.13565971 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038739666 = queryNorm
            0.30952093 = fieldWeight in 4092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4092)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)

Abstract: This 1993 edition is the authoritative guide to constructing single-language thesauri, one of the most powerful tools for information retrieval. Written by experts, Z39.19 shows how to formulate descriptors, establish relationships among terms, and present the information in print and on a screen. Also included are thesaurus maintenance procedures and recommended features for thesaurus management systems.
Editor: National Information Standards Organization
Footnote: Reviewed in: Knowledge organization 22(1995) no.3/4, pp. 180-181 (M. Hudon)

Crouch, C.J.: An approach to the automatic construction of global thesauri (1990) 0.02

0.023399485 = product of:
  0.070198454 = sum of:
    0.01305764 = weight(_text_:information in 4042) [ClassicSimilarity], result of:
      0.01305764 = score(doc=4042,freq=4.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.1920054 = fieldWeight in 4042, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4042)
    0.0387704 = weight(_text_:retrieval in 4042) [ClassicSimilarity], result of:
      0.0387704 = score(doc=4042,freq=4.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.33085006 = fieldWeight in 4042, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4042)
    0.018370414 = product of:
      0.03674083 = sum of:
        0.03674083 = weight(_text_:22 in 4042) [ClassicSimilarity], result of:
          0.03674083 = score(doc=4042,freq=2.0), product of:
            0.13565971 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038739666 = queryNorm
            0.2708308 = fieldWeight in 4042, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4042)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)

Abstract: The benefits of a well-constructed thesaurus to an information retrieval system have long been recognised by both researchers and practitioners in the field. Examines both early and current approaches to automatic thesaurus construction and describes an approach to the automatic generation of global thesauri based on the term discrimination value model of Salton, Yang, and Yu and on an appropriate clustering algorithm. This method has been implemented and applied to 2 document collections. Preliminary results indicate that this method, which produces improvements in retrieval performance in excess of 10 and 15% in the test collections, is viable and worthy of continued investigation.
Date: 22. 4.1996 3:39:53
Source: Information processing and management. 26(1990), no.5, pp. 629-640

Chen, H.; Yim, T.; Fye, D.: Automatic thesaurus generation for an electronic community system (1995) 0.02

0.022569243 = product of:
  0.067707725 = sum of:
    0.0065951035 = weight(_text_:information in 2918) [ClassicSimilarity], result of:
      0.0065951035 = score(doc=2918,freq=2.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.09697737 = fieldWeight in 2918, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2918)
    0.01958201 = weight(_text_:retrieval in 2918) [ClassicSimilarity], result of:
      0.01958201 = score(doc=2918,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.16710453 = fieldWeight in 2918, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2918)
    0.04153061 = weight(_text_:techniques in 2918) [ClassicSimilarity], result of:
      0.04153061 = score(doc=2918,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.24335694 = fieldWeight in 2918, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2918)
  0.33333334 = coord(3/9)

Abstract: Reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for the research was the Worm Community System, which contains a comprehensive library of specialized community data and literature, currently in use by molecular biologists who study the nematode worm. The resulting worm thesaurus included 2709 researchers' names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighbouring terms indicating relevant concepts. The thesaurus was developed as an online search aid. Tests the worm thesaurus in an experiment with 6 worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent 'memory jogging' device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers' queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system.
Source: Journal of the American Society for Information Science. 46(1995) no.3, pp. 175-193
Theme: Verbale Doksprachen im Online-Retrieval

Martín-Moncunill, D.; García-Barriocanal, E.; Sicilia, M.-A.; Sánchez-Alonso, S.: Evaluating the practical applicability of thesaurus-based keyphrase extraction in the agricultural domain : insights from the VOA3R project (2015) 0.02

0.022569243 = product of:
  0.067707725 = sum of:
    0.0065951035 = weight(_text_:information in 2106) [ClassicSimilarity], result of:
      0.0065951035 = score(doc=2106,freq=2.0), product of:
        0.06800663 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.038739666 = queryNorm
        0.09697737 = fieldWeight in 2106, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2106)
    0.01958201 = weight(_text_:retrieval in 2106) [ClassicSimilarity], result of:
      0.01958201 = score(doc=2106,freq=2.0), product of:
        0.1171842 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.038739666 = queryNorm
        0.16710453 = fieldWeight in 2106, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2106)
    0.04153061 = weight(_text_:techniques in 2106) [ClassicSimilarity], result of:
      0.04153061 = score(doc=2106,freq=2.0), product of:
        0.17065717 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.038739666 = queryNorm
        0.24335694 = fieldWeight in 2106, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2106)
  0.33333334 = coord(3/9)

Abstract: The use of Knowledge Organization Systems (KOSs) in aggregated metadata collections facilitates the implementation of search mechanisms operating on the same term or keyphrase space, thus preparing the ground for improved browsing, more accurate retrieval and better user profiling. Automatic thesaurus-based keyphrase extraction appears to be an inexpensive tool to obtain this information, but the studies on its effectiveness are scattered and do not consider the practical applicability of these techniques compared to the quality obtained by involving human experts. This paper presents an evaluation of keyphrase extraction using the KEA software and the AGROVOC vocabulary on a sample of a large collection of metadata in the field of agriculture from the AGRIS database. This effort includes a double evaluation, the classical automatic evaluation based on precision and recall measures, plus a blind evaluation aimed to contrast the quality of the keyphrases extracted against expert-provided samples and against the keyphrases originally recorded in the metadata. Results show not only that KEA outperforms humans in matching the original keyphrases, but also that the quality of the keyphrases extracted was similar to those provided by humans.
