Search (6 results, page 1 of 1)

Siebenkäs, A.; Markscheffel, B.: Conception of a workflow for the semi-automatic construction of a thesaurus for the German printing industry (2015) 0.04
```
0.041552994 = product of:
  0.20776497 = sum of:
    0.20776497 = weight(_text_:thesaurus in 2091) [ClassicSimilarity], result of:
      0.20776497 = score(doc=2091,freq=12.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.8754312 = fieldWeight in 2091, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2091)
  0.2 = coord(1/5)
```
Abstract

During the BMWI-funded project "Print-IT", the need for a thesaurus-based, uniform and consistent language for the German printing industry became evident. In this paper we introduce a semi-automatic construction approach for such a thesaurus and present a workflow that helps users generate thesaurus-typical information structures from relevant digitized resources using common IT tools.

Object

MIDOS Thesaurus

Theme

Konzeption und Anwendung des Prinzips Thesaurus

Milstead, J.L.: Thesauri in a full-text world (1998) 0.04

0.03815076 = product of:
  0.0953769 = sum of:
    0.06058549 = weight(_text_:thesaurus in 2337) [ClassicSimilarity], result of:
      0.06058549 = score(doc=2337,freq=2.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.2552809 = fieldWeight in 2337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2337)
    0.03479141 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
      0.03479141 = score(doc=2337,freq=2.0), product of:
        0.1798465 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051357865 = queryNorm
        0.19345059 = fieldWeight in 2337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2337)
  0.4 = coord(2/5)

Date: 22. 9.1997 19:16:05
Theme: Konzeption und Anwendung des Prinzips Thesaurus
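
The score explanations in this listing can be reproduced by hand. The following is a minimal sketch, assuming Lucene's ClassicSimilarity defaults (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and taking `queryNorm` and `fieldNorm` directly from the listing rather than deriving them. The function names `idf`, `term_score`, and `document_score` are hypothetical helpers, not part of any Lucene API. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(term_matches, total_clauses: int, max_docs: int,
                   query_norm: float, field_norm: float) -> float:
    """Sum of the term scores, scaled by coord = matched clauses / total clauses.

    term_matches is a list of (freq, doc_freq) pairs, one per matching term.
    """
    coord = len(term_matches) / total_clauses
    total = sum(
        term_score(freq, doc_freq, max_docs, query_norm, field_norm)
        for freq, doc_freq in term_matches
    )
    return coord * total


if __name__ == "__main__":
    # doc=2337 (Milstead): matches "thesaurus" and "22", coord(2/5)
    print(document_score([(2.0, 1182), (2.0, 3622)], 5, 44218,
                         0.051357865, 0.0390625))
    # doc=2091 (Siebenkäs; Markscheffel): matches "thesaurus" only, coord(1/5)
    print(document_score([(12.0, 1182)], 5, 44218,
                         0.051357865, 0.0546875))
```

The two calls should land close to the listed 0.03815076 and 0.041552994. The second term in the Milstead record, `22`, matches the day in the record's date field, which is why that record gets coord(2/5) while the others get coord(1/5).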

Willis, C.; Losee, R.M.: A random walk on an ontology : using thesaurus structure for automatic subject indexing (2013) 0.03
```
0.025647065 = product of:
  0.12823533 = sum of:
    0.12823533 = weight(_text_:thesaurus in 1016) [ClassicSimilarity], result of:
      0.12823533 = score(doc=1016,freq=14.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.5403279 = fieldWeight in 1016, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.03125 = fieldNorm(doc=1016)
  0.2 = coord(1/5)
```
Abstract

Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre-indexed with 4 different thesauri (AGROVOC [UN Food and Agriculture Organization], high-energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and medical subject headings [MeSH]). We also introduce a thesaurus-centric matching algorithm intended to improve the quality of candidate concepts. In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the 4 thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.

Theme

Konzeption und Anwendung des Prinzips Thesaurus

Thönssen, B.: Automatische Indexierung und Schnittstellen zu Thesauri (1988) 0.02

0.024234196 = product of:
  0.12117098 = sum of:
    0.12117098 = weight(_text_:thesaurus in 30) [ClassicSimilarity], result of:
      0.12117098 = score(doc=30,freq=2.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.5105618 = fieldWeight in 30, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.078125 = fieldNorm(doc=30)
  0.2 = coord(1/5)

Theme: Konzeption und Anwendung des Prinzips Thesaurus

Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.02
```
0.023744568 = product of:
  0.11872284 = sum of:
    0.11872284 = weight(_text_:thesaurus in 3081) [ClassicSimilarity], result of:
      0.11872284 = score(doc=3081,freq=12.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.5002464 = fieldWeight in 3081, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.03125 = fieldNorm(doc=3081)
  0.2 = coord(1/5)
```
Abstract

The study analyzes the dynamic development and use of thesaurus terms. In addition, it focuses on the factors that influence the number of index terms per document or journal. MeSH and the corresponding database "MEDLINE" served as the object of study. The most important findings are: 1. The MeSH thesaurus has developed logarithmically through three distinct phases. Such a thesaurus should follow the equation "T = 3,076.6 ln(d) - 22,695 + 0.0039d" (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one therefore needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term for every 256 newly indexed documents. 2. The distribution of thesaurus terms yielded three categories: heavily, normally, and rarely used headings. The last group is in a test phase, while in the first and second categories newly added descriptors lead to thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count, for articles between one and twenty-one pages long. 4. Journal articles that appear in MEDLINE with abstracts receive almost two more descriptors. 5. The findability of non-English-language documents in MEDLINE is lower than that of English-language documents. 6. Articles from journals with an impact factor of 0 to fifteen do not receive more index terms than those from the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that MEDLINE has three categories of publications. In addition, there are a few strongly favored journals.
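
The growth equation quoted in the abstract above can be evaluated directly. The sketch below reads its coefficients as 3076.6, 22695, and 0.0039 (the German-language source writes them with decimal commas and thousands points). The constant and function names are illustrative only. Linking the 256-document figure to the reciprocal of the linear coefficient is an interpretation, not something the abstract states.

```python
import math

# Coefficients as read from the abstract; all names here are illustrative.
LOG_COEFF = 3076.6
OFFSET = 22695.0
LINEAR_COEFF = 0.0039


def thesaurus_terms(documents: float) -> float:
    """Estimated number of thesaurus terms T after `documents` indexed documents."""
    return LOG_COEFF * math.log(documents) - OFFSET + LINEAR_COEFF * documents


def documents_per_new_term_linear() -> float:
    """Documents per new term implied by the linear coefficient alone."""
    return 1.0 / LINEAR_COEFF


if __name__ == "__main__":
    for d in (1_500, 1_600, 100_000, 10_000_000):
        print(f"d = {d:>10,}: T = {thesaurus_terms(d):,.1f}")
    print(f"documents per new term (linear part): "
          f"{documents_per_new_term_linear():.0f}")
```

Under this reading, T is still negative at 1,500 documents and turns positive just below 1,600, which fits the abstract's statement that about 1,600 documents are needed. For large d the logarithmic part flattens and the linear part dominates, giving roughly one new term per 256 documents.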

Theme

Konzeption und Anwendung des Prinzips Thesaurus

Zimmermann, H.H.: Automatische Indexierung und elektronische Thesauri (1996) 0.02

0.019387359 = product of:
  0.09693679 = sum of:
    0.09693679 = weight(_text_:thesaurus in 2062) [ClassicSimilarity], result of:
      0.09693679 = score(doc=2062,freq=2.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.40844947 = fieldWeight in 2062, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.0625 = fieldNorm(doc=2062)
  0.2 = coord(1/5)

Theme: Konzeption und Anwendung des Prinzips Thesaurus
