Search (6 results, page 1 of 1)

  • × theme_ss:"Automatisches Indexieren"
  • × theme_ss:"Konzeption und Anwendung des Prinzips Thesaurus"
  1. Zimmermann, H.H.: Automatische Indexierung und elektronische Thesauri (1996) 0.02
    0.022294924 = product of:
      0.044589847 = sum of:
        0.042709544 = weight(_text_:von in 2062) [ClassicSimilarity], result of:
          0.042709544 = score(doc=2062,freq=4.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.3334936 = fieldWeight in 2062, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0625 = fieldNorm(doc=2062)
        0.0018803024 = product of:
          0.005640907 = sum of:
            0.005640907 = weight(_text_:a in 2062) [ClassicSimilarity], result of:
              0.005640907 = score(doc=2062,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.10191591 = fieldWeight in 2062, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2062)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Overview of the possibilities for using automatic indexing to index text documents, with a brief presentation of the PASSAT, CTX and IDX methods and a sketch of the benefits of integrating thesauri into the automatic indexing process.
    Type
    a
  2. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.01
    0.009987781 = product of:
      0.039951123 = sum of:
        0.039951123 = weight(_text_:von in 3081) [ClassicSimilarity], result of:
          0.039951123 = score(doc=3081,freq=14.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.3119547 = fieldWeight in 3081, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
      0.25 = coord(1/4)
    
    Abstract
    This work analyzes the dynamic development and use of thesaurus terms. It also focuses on the factors that influence the number of index terms per document or journal. The objects of study were MeSH and the corresponding database "MEDLINE". The most important findings are: 1. The MeSH thesaurus has developed logarithmically through three distinct phases. Such a thesaurus should follow the equation "T = 3,076.6 ln(d) - 22,695 + 0.0039d" (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one therefore needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term for every 256 newly indexed documents. 2. The distribution of thesaurus terms yielded three categories: heavily, normally, and rarely used headings. The last group is in a test phase, while in the first and second categories the newly added descriptors lead to thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count, for articles between one and twenty-one pages long. 4. Journal articles that appear in MEDLINE with abstracts receive almost two more descriptors. 5. The findability of non-English-language documents in MEDLINE is lower than that of English-language documents. 6. Articles from journals with an impact factor of 0 to fifteen do not receive more index terms than those from the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that MEDLINE has three categories of publications. In addition, there are a few strongly favored journals.
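    The growth equation quoted in the abstract above can be checked numerically. The following is a minimal sketch, assuming the coefficients are 3076.6, 22695 and 0.0039 (the original abstract gives them in German number notation) and assuming that the "about 1,600 documents" figure refers to the point where T first becomes positive. The function names are hypothetical and chosen for illustration only.

```python
import math


def thesaurus_terms(d: float) -> float:
    """Terms T predicted for d documents: T = 3076.6 ln(d) - 22695 + 0.0039 d."""
    return 3076.6 * math.log(d) - 22695 + 0.0039 * d


def marginal_terms(d: float) -> float:
    """Derivative dT/dd: new terms per additional document at collection size d."""
    return 3076.6 / d + 0.0039


# Assumption: the start-up size is where T crosses zero.
for d in (1500, 1600):
    print(d, thesaurus_terms(d))

# For very large d the logarithmic part of the derivative vanishes, leaving
# 0.0039 terms per document, i.e. roughly one new term per 1/0.0039 documents.
print(1 / 0.0039)
```

    Under these assumptions T changes sign between 1,500 and 1,600 documents, and the linear coefficient 0.0039 corresponds to about one new term per 256 documents, which agrees with the figures stated in the abstract.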
  3. Milstead, J.L.: Thesauri in a full-text world (1998) 0.01
    0.006437426 = product of:
      0.025749704 = sum of:
        0.025749704 = product of:
          0.038624555 = sum of:
            0.006106462 = weight(_text_:a in 2337) [ClassicSimilarity], result of:
              0.006106462 = score(doc=2337,freq=6.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.11032722 = fieldWeight in 2337, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
            0.032518093 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
              0.032518093 = score(doc=2337,freq=2.0), product of:
                0.16809508 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04800207 = queryNorm
                0.19345059 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Despite early claims to the contrary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contraction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing continues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future.
    Date
    22. 9.1997 19:16:05
    Type
    a
  4. Siebenkäs, A.; Markscheffel, B.: Conception of a workflow for the semi-automatic construction of a thesaurus for the German printing industry (2015) 0.00
    0.0011633779 = product of:
      0.0046535116 = sum of:
        0.0046535116 = product of:
          0.013960535 = sum of:
            0.013960535 = weight(_text_:a in 2091) [ClassicSimilarity], result of:
              0.013960535 = score(doc=2091,freq=16.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.25222903 = fieldWeight in 2091, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2091)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    During the BMWI-funded project "Print-IT", the need for a thesaurus-based, uniform and consistent language for the German printing industry became evident. In this paper we introduce a semi-automatic construction approach for such a thesaurus and present a workflow that helps users generate thesaurus-typical information structures from relevant digitized resources using common IT tools.
    Type
    a
  5. Willis, C.; Losee, R.M.: A random walk on an ontology: using thesaurus structure for automatic subject indexing (2013) 0.00
    7.051135E-4 = product of:
      0.002820454 = sum of:
        0.002820454 = product of:
          0.008461362 = sum of:
            0.008461362 = weight(_text_:a in 1016) [ClassicSimilarity], result of:
              0.008461362 = score(doc=1016,freq=18.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.15287387 = fieldWeight in 1016, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1016)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre-indexed with 4 different thesauri (AGROVOC [UN Food and Agriculture Organization], high-energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and medical subject headings [MeSH]). We also introduce a thesaurus-centric matching algorithm intended to improve the quality of candidate concepts. In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the 4 thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.
    Type
    a
  6. Thönssen, B.: Automatische Indexierung und Schnittstellen zu Thesauri (1988) 0.00
    5.875945E-4 = product of:
      0.002350378 = sum of:
        0.002350378 = product of:
          0.007051134 = sum of:
            0.007051134 = weight(_text_:a in 30) [ClassicSimilarity], result of:
              0.007051134 = score(doc=30,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.12739488 = fieldWeight in 30, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=30)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Type
    a
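
The score trees in the results above follow the Lucene ClassicSimilarity pattern: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term contributes queryWeight * fieldWeight, and coord factors scale the sums. The following is a minimal sketch that recomputes the scores of results 1 and 6 from the numbers shown in the listing. The helper names are hypothetical, and the script uses double precision, so it only agrees with the listed single-precision values up to rounding.

```python
import math

# Values copied from the explain output in the listing.
QUERY_NORM = 0.04800207
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """queryWeight * fieldWeight for one term in one document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# Result 1 (doc 2062): "von" matches directly, "a" sits in a nested clause
# scaled by coord(1/3); the outer query applies coord(2/4).
von_2062 = term_score(freq=4.0, doc_freq=8340, field_norm=0.0625)
a_2062 = term_score(freq=2.0, doc_freq=37942, field_norm=0.0625)
score_2062 = (von_2062 + a_2062 * (1 / 3)) * (2 / 4)

# Result 6 (doc 30): only "a" matches, scaled by coord(1/3) and coord(1/4).
a_30 = term_score(freq=2.0, doc_freq=37942, field_norm=0.078125)
score_30 = a_30 * (1 / 3) * (1 / 4)

print(f"doc 2062: {score_2062:.9f}")
print(f"doc 30:   {score_30:.9f}")
```

The same two helpers can be applied to the other results by plugging in the freq, docFreq, fieldNorm and coord values from their trees.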