Search (6 results, page 1 of 1)

  • theme_ss:"Automatisches Indexieren"
  • theme_ss:"Konzeption und Anwendung des Prinzips Thesaurus"
  1. Thönssen, B.: Automatische Indexierung und Schnittstellen zu Thesauri (1988) 0.38
    0.37662664 = product of:
      0.45195198 = sum of:
        0.06281625 = weight(_text_:und in 30) [ClassicSimilarity], result of:
          0.06281625 = score(doc=30,freq=12.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.5998219 = fieldWeight in 30, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.078125 = fieldNorm(doc=30)
        0.12236831 = weight(_text_:anwendung in 30) [ClassicSimilarity], result of:
          0.12236831 = score(doc=30,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.5349128 = fieldWeight in 30, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.078125 = fieldNorm(doc=30)
        0.040036436 = weight(_text_:des in 30) [ClassicSimilarity], result of:
          0.040036436 = score(doc=30,freq=2.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.30596817 = fieldWeight in 30, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.078125 = fieldNorm(doc=30)
        0.17099062 = weight(_text_:prinzips in 30) [ClassicSimilarity], result of:
          0.17099062 = score(doc=30,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.6323167 = fieldWeight in 30, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.078125 = fieldNorm(doc=30)
        0.055740345 = product of:
          0.11148069 = sum of:
            0.11148069 = weight(_text_:thesaurus in 30) [ClassicSimilarity], result of:
              0.11148069 = score(doc=30,freq=2.0), product of:
                0.21834905 = queryWeight, product of:
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.04725067 = queryNorm
                0.5105618 = fieldWeight in 30, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.078125 = fieldNorm(doc=30)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Abstract
    An interface between programs for automatic indexing (PRIMUS-IDX) and for machine-based thesaurus management (INDEX) is intended to allow large volumes of text to be indexed quickly, cost-effectively, and consistently, and to provide improved search capabilities. The goal is a procedure that runs on PCs and can handle German-language texts in particular.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  2. Zimmermann, H.H.: Automatische Indexierung und elektronische Thesauri (1996) 0.32
    0.32434347 = product of:
      0.3892122 = sum of:
        0.0458745 = weight(_text_:und in 2062) [ClassicSimilarity], result of:
          0.0458745 = score(doc=2062,freq=10.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.438048 = fieldWeight in 2062, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0625 = fieldNorm(doc=2062)
        0.097894646 = weight(_text_:anwendung in 2062) [ClassicSimilarity], result of:
          0.097894646 = score(doc=2062,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.42793027 = fieldWeight in 2062, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.0625 = fieldNorm(doc=2062)
        0.0640583 = weight(_text_:des in 2062) [ClassicSimilarity], result of:
          0.0640583 = score(doc=2062,freq=8.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.48954904 = fieldWeight in 2062, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0625 = fieldNorm(doc=2062)
        0.1367925 = weight(_text_:prinzips in 2062) [ClassicSimilarity], result of:
          0.1367925 = score(doc=2062,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.50585335 = fieldWeight in 2062, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.0625 = fieldNorm(doc=2062)
        0.044592276 = product of:
          0.08918455 = sum of:
            0.08918455 = weight(_text_:thesaurus in 2062) [ClassicSimilarity], result of:
              0.08918455 = score(doc=2062,freq=2.0), product of:
                0.21834905 = queryWeight, product of:
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.04725067 = queryNorm
                0.40844947 = fieldWeight in 2062, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2062)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Abstract
    Overview of the possibilities for using automatic indexing to index text documents, with a brief presentation of the PASSAT, CTX, and IDX methods and a sketch of the benefits of integrating thesauri into the automatic indexing process.
    Imprint
    Düsseldorf : Universitäts- und Landesbibliothek
    Series
    Schriften der Universitäts- und Landesbibliothek Düsseldorf; Bd.25
    Source
    Zukunft der Sacherschließung im OPAC: Vorträge des 2. Düsseldorfer OPAC-Kolloquiums am 21. Juni 1995. Ed. by E. Niggemann and K. Lepsky
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  3. Siebenkäs, A.; Markscheffel, B.: Conception of a workflow for the semi-automatic construction of a thesaurus for the German printing industry (2015) 0.29
    0.28908566 = product of:
      0.3469028 = sum of:
        0.017951237 = weight(_text_:und in 2091) [ClassicSimilarity], result of:
          0.017951237 = score(doc=2091,freq=2.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.17141339 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.08565781 = weight(_text_:anwendung in 2091) [ClassicSimilarity], result of:
          0.08565781 = score(doc=2091,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.37443897 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.028025504 = weight(_text_:des in 2091) [ClassicSimilarity], result of:
          0.028025504 = score(doc=2091,freq=2.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.2141777 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.11969343 = weight(_text_:prinzips in 2091) [ClassicSimilarity], result of:
          0.11969343 = score(doc=2091,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.44262168 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.09557479 = product of:
          0.19114958 = sum of:
            0.19114958 = weight(_text_:thesaurus in 2091) [ClassicSimilarity], result of:
              0.19114958 = score(doc=2091,freq=12.0), product of:
                0.21834905 = queryWeight, product of:
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.04725067 = queryNorm
                0.8754312 = fieldWeight in 2091, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2091)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Abstract
    During the BMWI-funded project "Print-IT", the need for a thesaurus-based, uniform, and consistent language for the German printing industry became evident. In this paper we introduce a semi-automatic construction approach for such a thesaurus and present a workflow that helps users generate thesaurus-typical information structures from relevant digitized resources with the help of common IT tools.
    Object
    MIDOS Thesaurus
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  4. Milstead, J.L.: Thesauri in a full-text world (1998) 0.22
    0.2227245 = product of:
      0.2672694 = sum of:
        0.012822312 = weight(_text_:und in 2337) [ClassicSimilarity], result of:
          0.012822312 = score(doc=2337,freq=2.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.12243814 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.061184157 = weight(_text_:anwendung in 2337) [ClassicSimilarity], result of:
          0.061184157 = score(doc=2337,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.2674564 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.020018218 = weight(_text_:des in 2337) [ClassicSimilarity], result of:
          0.020018218 = score(doc=2337,freq=2.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.15298408 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.08549531 = weight(_text_:prinzips in 2337) [ClassicSimilarity], result of:
          0.08549531 = score(doc=2337,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.31615835 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.087749414 = sum of:
          0.055740345 = weight(_text_:thesaurus in 2337) [ClassicSimilarity], result of:
            0.055740345 = score(doc=2337,freq=2.0), product of:
              0.21834905 = queryWeight, product of:
                4.6210785 = idf(docFreq=1182, maxDocs=44218)
                0.04725067 = queryNorm
              0.2552809 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6210785 = idf(docFreq=1182, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03200907 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03200907 = score(doc=2337,freq=2.0), product of:
              0.16546379 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04725067 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.8333333 = coord(5/6)
    
    Date
    22. 9.1997 19:16:05
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  5. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.19
    0.19344495 = product of:
      0.23213395 = sum of:
        0.03243817 = weight(_text_:und in 3081) [ClassicSimilarity], result of:
          0.03243817 = score(doc=3081,freq=20.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.3097467 = fieldWeight in 3081, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.048947323 = weight(_text_:anwendung in 3081) [ClassicSimilarity], result of:
          0.048947323 = score(doc=3081,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.21396513 = fieldWeight in 3081, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.027738057 = weight(_text_:des in 3081) [ClassicSimilarity], result of:
          0.027738057 = score(doc=3081,freq=6.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.21198097 = fieldWeight in 3081, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.06839625 = weight(_text_:prinzips in 3081) [ClassicSimilarity], result of:
          0.06839625 = score(doc=3081,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.25292668 = fieldWeight in 3081, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.054614164 = product of:
          0.10922833 = sum of:
            0.10922833 = weight(_text_:thesaurus in 3081) [ClassicSimilarity], result of:
              0.10922833 = score(doc=3081,freq=12.0), product of:
                0.21834905 = queryWeight, product of:
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.04725067 = queryNorm
                0.5002464 = fieldWeight in 3081, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3081)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Abstract
    The thesis analyzes the dynamic development and use of thesaurus terms. It also focuses on the factors that influence the number of index terms per document or journal. The objects of study were MeSH and the corresponding database "MEDLINE". The most important findings are: 1. The MeSH thesaurus developed logarithmically through three distinct phases. Such a thesaurus should follow the equation "T = 3,076.6 ln(d) - 22,695 + 0.0039d" (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one therefore needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term for every 256 newly indexed documents. 2. The distribution of thesaurus terms yielded three categories: heavily, normally, and rarely used headings. The last group is in a test phase, while in the first and second categories the newly added descriptors lead to thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count, for articles between one and twenty-one pages long. 4. Journal articles that appear in MEDLINE with abstracts receive almost two more descriptors. 5. The findability of non-English-language documents in MEDLINE is lower than that of English-language documents. 6. Articles from journals with an impact factor of 0 to fifteen do not receive more index terms than those from the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that MEDLINE has three categories of publications. In addition, there are a few strongly favored journals.
    Footnote
    Dissertation, Humboldt-Universität zu Berlin - Institut für Bibliotheks- und Informationswissenschaft.
    Imprint
    Berlin : Humboldt-Universität zu Berlin / Institut für Bibliotheks- und Informationswissenschaft
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
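    The growth equation quoted in the abstract of result 5 can be explored numerically. The sketch below is a minimal illustration, not the dissertation's own code. It assumes the coefficients in the source abstract are written in German number format, so they are read here as 3076.6, 22695, and 0.0039. The function names `mesh_terms` and `break_even_documents` are hypothetical. The code evaluates T(d) and uses bisection to find where T first becomes positive, which can be compared with the abstract's figure of about 1,600 documents.

```python
import math


def mesh_terms(d):
    """Evaluate T = 3076.6 * ln(d) - 22695 + 0.0039 * d.

    The coefficients are an assumed reading of the abstract's equation.
    """
    return 3076.6 * math.log(d) - 22695 + 0.0039 * d


def break_even_documents(lo=1000.0, hi=2000.0, tol=1e-6):
    """Bisection for the d at which T(d) crosses zero.

    T is strictly increasing in d, so a bracket with a sign change
    contains exactly one root.
    """
    if not (mesh_terms(lo) < 0 < mesh_terms(hi)):
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mesh_terms(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


root = break_even_documents()
print(f"T(d) turns positive near d = {root:.0f}")
```

    For large d the slope of T approaches the linear coefficient 0.0039, which is close to 1/256 and so appears consistent with the abstract's statement of one new term per 256 newly indexed documents.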
  6. Willis, C.; Losee, R.M.: A random walk on an ontology : using thesaurus structure for automatic subject indexing (2013) 0.17
    0.16883837 = product of:
      0.20260604 = sum of:
        0.01025785 = weight(_text_:und in 1016) [ClassicSimilarity], result of:
          0.01025785 = score(doc=1016,freq=2.0), product of:
            0.104724824 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04725067 = queryNorm
            0.09795051 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.048947323 = weight(_text_:anwendung in 1016) [ClassicSimilarity], result of:
          0.048947323 = score(doc=1016,freq=2.0), product of:
            0.22876309 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.04725067 = queryNorm
            0.21396513 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.016014574 = weight(_text_:des in 1016) [ClassicSimilarity], result of:
          0.016014574 = score(doc=1016,freq=2.0), product of:
            0.13085164 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.04725067 = queryNorm
            0.12238726 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.06839625 = weight(_text_:prinzips in 1016) [ClassicSimilarity], result of:
          0.06839625 = score(doc=1016,freq=2.0), product of:
            0.27041927 = queryWeight, product of:
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.04725067 = queryNorm
            0.25292668 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.723078 = idf(docFreq=392, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.058990043 = product of:
          0.117980085 = sum of:
            0.117980085 = weight(_text_:thesaurus in 1016) [ClassicSimilarity], result of:
              0.117980085 = score(doc=1016,freq=14.0), product of:
                0.21834905 = queryWeight, product of:
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.04725067 = queryNorm
                0.5403279 = fieldWeight in 1016, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  4.6210785 = idf(docFreq=1182, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1016)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Abstract
    Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre-indexed with 4 different thesauri (AGROVOC [UN Food and Agriculture Organization], high-energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and medical subject headings [MeSH]). We also introduce a thesaurus-centric matching algorithm intended to improve the quality of candidate concepts. In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the 4 thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
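The score breakdowns listed with each result follow a TF-IDF pattern: each term clause is queryWeight times fieldWeight, the clause scores are summed, and the sum is scaled by a coord factor. As a rough sketch, the Python below recomputes the top-level score of result 1 (doc=30) from the frequencies, document counts, queryNorm, and fieldNorm shown in its breakdown. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions that happen to reproduce the listed values; the helper names `idf`, `term_score`, and `result_1_score` are hypothetical and are not Lucene API calls.

```python
import math

# Collection statistics copied from the score breakdowns.
MAX_DOCS = 44218
QUERY_NORM = 0.04725067


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form, matches the listed idf values)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term clause: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def result_1_score():
    """Recompute the score of result 1 from its listed inputs."""
    field_norm = 0.078125  # fieldNorm(doc=30)
    clauses = [
        term_score(12.0, 13101, field_norm),       # und
        term_score(2.0, 948, field_norm),          # anwendung
        term_score(2.0, 7536, field_norm),         # des
        term_score(2.0, 392, field_norm),          # prinzips
        term_score(2.0, 1182, field_norm) * 0.5,   # thesaurus, nested coord(1/2)
    ]
    # Outer coord(5/6): five of the six top-level clauses matched.
    return sum(clauses) * (5 / 6)


score = result_1_score()
print(f"recomputed score for result 1: {score:.4f}")
```

The recomputed value can be checked against the 0.37662664 shown for result 1. Small differences in the last digits are expected because the breakdown reports single-precision values while Python uses double precision.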