Search (7 results, page 1 of 1)

Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.21
```
0.20533463 = product of:
  0.2737795 = sum of:
    0.061978623 = product of:
      0.18593587 = sum of:
        0.18593587 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
          0.18593587 = score(doc=1000,freq=2.0), product of:
            0.39700332 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046827413 = queryNorm
            0.46834838 = fieldWeight in 1000, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1000)
      0.33333334 = coord(1/3)
    0.18593587 = weight(_text_:2f in 1000) [ClassicSimilarity], result of:
      0.18593587 = score(doc=1000,freq=2.0), product of:
        0.39700332 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046827413 = queryNorm
        0.46834838 = fieldWeight in 1000, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1000)
    0.02586502 = weight(_text_:data in 1000) [ClassicSimilarity], result of:
      0.02586502 = score(doc=1000,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 1000, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1000)
  0.75 = coord(3/4)
```
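The score tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and constant names are illustrative and not part of the search system. All numeric inputs are copied from the tree for doc 1000.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency (assumed formula)
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # ClassicSimilarity term frequency: square root of the raw count
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain output
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 1000
MAX_DOCS = 44218
QUERY_NORM = 0.046827413
FIELD_NORM = 0.0390625

score_3a = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_2f = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_data = term_score(2.0, 5088, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "3a" clause sits in a nested query scaled by coord(1/3);
# the outer sum is then scaled by coord(3/4).
inner = score_3a * (1 / 3)
total = (inner + score_2f + score_data) * (3 / 4)

print(round(total, 4))
```

Double-precision arithmetic may differ from Lucene's single-precision values in the last digits, so any comparison against the tree should use a small tolerance.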
Abstract

This thesis presents the construction of a thematically ordered thesaurus based on the subject headings of the Gemeinsame Normdatei (GND), using the DDC notations they contain. The top level of the thesaurus consists of the DDC subject groups (DDC-Sachgruppen) of the Deutsche Nationalbibliothek (DNB). The thesaurus is constructed by rules, applying Linked Data principles in a SPARQL processor. It serves the automated extraction of metadata from scholarly publications by means of a computational-linguistic extractor, which processes digital full texts. The extractor identifies subject headings by comparing character strings in the text with the labels in the thesaurus, ranks the hits by their relevance in the text, and returns the assigned subject groups in ranked order. The underlying assumption is that the sought subject group is returned among the top ranks. The performance of the method is validated in a three-stage procedure. First, a gold standard is compiled from documents available in the online catalogue of the DNB, based on metadata and the findings of a brief autopsy of each document. The documents are distributed across 14 of the subject groups, with a batch size of 50 documents each. All documents are then indexed with the extractor and the categorization results are documented. Finally, the resulting retrieval performance is assessed both for a hard (binary) categorization and for a ranked return of the subject groups.

Content

Master's thesis, Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. See: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. See also the presentation at: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=web&cd=&ved=0CAIQw7AJahcKEwjwoZzzytz_AhUAAAAAHQAAAAAQAg&url=https%3A%2F%2Fwiki.dnb.de%2Fdownload%2Fattachments%2F252121510%2FDA3%2520Workshop-Gabler.pdf%3Fversion%3D1%26modificationDate%3D1671093170000%26api%3Dv2&psig=AOvVaw0szwENK1or3HevgvIDOfjx&ust=1687719410889597&opi=89978449.
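The extraction step described in the abstract for this record (match thesaurus labels in a full text, then return subject groups in ranked order) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy thesaurus, the function name `rank_subject_groups`, and the use of raw match counts as a stand-in for the thesis's relevance measure, which the abstract does not specify.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: label -> DDC subject group (illustrative only)
THESAURUS = {
    "thesaurus": "020",
    "metadata": "020",
    "algorithm": "004",
    "protein": "570",
}

def rank_subject_groups(text, thesaurus):
    """Count label matches in the text and rank subject groups by hit count."""
    lowered = text.lower()
    scores = defaultdict(int)
    for label, group in thesaurus.items():
        # Plain substring matching; a real extractor would tokenize and lemmatize
        hits = lowered.count(label.lower())
        if hits:
            scores[group] += hits
    # Highest count first; ties broken by group notation for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

sample = "The thesaurus supplies metadata; the thesaurus is built by an algorithm."
ranking = rank_subject_groups(sample, THESAURUS)
print(ranking)
```

A hard (binary) categorization would take only the first entry of the ranking, whereas the ranked evaluation checks whether the expected group appears among the top entries.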

Scott, D.S.: Subject classification and natural-language processing for retrieval in large databases (1989) 0.01

```
0.011990375 = product of:
  0.0479615 = sum of:
    0.0479615 = product of:
      0.095923 = sum of:
        0.095923 = weight(_text_:processing in 967) [ClassicSimilarity], result of:
          0.095923 = score(doc=967,freq=4.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.5060184 = fieldWeight in 967, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0625 = fieldNorm(doc=967)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Abstract: New forms of man-machine interaction are becoming available that have great power for the delivery of information. But the scales of speed and capacity on which the computing machines operate demand new thoughts as to how information can be stored and retrieved. The objective of the discussion in this paper is to argue for a combination of natural-language processing and subject classification to be able to meet the demands

Aitchison, J.: ¬A classification as a source for a thesaurus : the bibliographic classification of H.E. Bliss as a source of thesaurus terms and structure (1986) 0.01
```
0.010973599 = product of:
  0.043894395 = sum of:
    0.043894395 = weight(_text_:data in 1569) [ClassicSimilarity], result of:
      0.043894395 = score(doc=1569,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.29644224 = fieldWeight in 1569, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1569)
  0.25 = coord(1/4)
```
Abstract

The second edition of the Bibliographic Classification of H.E. Bliss (BC2), being prepared under the editorship of Jack Mills, Vanda Broughton and others, is a rich source of structure and terminology for thesauri covering different subject fields. The new edition employs facet analysis and is thesaurus-compatible. A number of facet-based thesauri have drawn upon Bliss for terms and relationships. In two of these thesauri the Bliss Classification was the source of both systematic and alphabetical displays. The DHSS-DATA thesaurus, published by the United Kingdom Department of Health and Social Security, provides controlled terms and Bliss class numbers for indexing and searching the DHSS-DATA database. The ECOT thesaurus (Educational courses and occupations thesaurus), prepared for the Department of Education and Science, uses the software designed for the British Standards Institution ROOT thesaurus to generate an alphabetical display from the systematic display derived from the Bliss schedules. Problems, benefits, and future prospects of Bliss-based thesaurus construction are discussed.

Root thesaurus. Pt.1.2 (1985) 0.01

```
0.009516701 = product of:
  0.038066804 = sum of:
    0.038066804 = product of:
      0.07613361 = sum of:
        0.07613361 = weight(_text_:22 in 467) [ClassicSimilarity], result of:
          0.07613361 = score(doc=467,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.46428138 = fieldWeight in 467, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=467)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date: 18. 5.2007 14:22:43

Karg, H.: Mapping Dewey and subject authorities : CrissCross (2007) 0.01

```
0.009516701 = product of:
  0.038066804 = sum of:
    0.038066804 = product of:
      0.07613361 = sum of:
        0.07613361 = weight(_text_:22 in 559) [ClassicSimilarity], result of:
          0.07613361 = score(doc=559,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.46428138 = fieldWeight in 559, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=559)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Content: Talk given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Green, R.: Making visible hidden relationships in the Dewey Decimal Classification : how relative index terms relate to DDC classes (2008) 0.01

```
0.007418666 = product of:
  0.029674664 = sum of:
    0.029674664 = product of:
      0.05934933 = sum of:
        0.05934933 = weight(_text_:processing in 2236) [ClassicSimilarity], result of:
          0.05934933 = score(doc=2236,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.3130829 = fieldWeight in 2236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2236)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Content: Relative Index (RI) terms in the Dewey Decimal Classification (DDC) system correspond to concepts that either approximate the whole of the class they index or that are in standing room there. DDC conventions and shallow natural language processing are used to determine automatically whether specific RI terms approximate the whole of or are in standing room in the classes they index. Approximately three-quarters of all RI terms are processed by the techniques described.

Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: FACET: thesaurus retrieval with semantic term expansion (2002) 0.01
```
0.0051730038 = product of:
  0.020692015 = sum of:
    0.020692015 = weight(_text_:data in 175) [ClassicSimilarity], result of:
      0.020692015 = score(doc=175,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.1397442 = fieldWeight in 175, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=175)
  0.25 = coord(1/4)
```
Abstract

There are many advantages for Digital Libraries in indexing with classifications or thesauri, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors. This demonstration of a research prototype illustrates a matching function for compound descriptors, or multi-concept subject headings, that does not rely on exact matching but incorporates term expansion via thesaurus semantic relationships to produce ranked results that take account of missing and partially matching terms. The matching function is based on a measure of semantic closeness between terms. The work is part of the EPSRC funded FACET project in collaboration with the UK National Museum of Science and Industry (NMSI) which includes the National Railway Museum. An export of NMSI's Collections Database is used as the dataset for the research. The J. Paul Getty Trust's Art and Architecture Thesaurus (AAT) is the main thesaurus in the project. The AAT is a widely used thesaurus (over 120,000 terms). Descriptors are organised in 7 facets representing separate conceptual classes of terms. The FACET application is a multi-tiered architecture accessing a SQL Server database, with an OLE DB connection. The thesauri are stored as relational tables in the Server's database. However, a key component of the system is a parallel representation of the underlying semantic network as an in-memory structure of thesaurus concepts (corresponding to preferred terms). The structure models the hierarchical and associative interrelationships of thesaurus concepts via weighted poly-hierarchical links. Its primary purpose is real-time semantic expansion of query terms, achieved by a spreading activation semantic closeness algorithm. Queries with associated results are stored persistently using XML format data. A Visual Basic interface combines a thesaurus browser and an initial term search facility that takes into account equivalence relationships. Terms are dragged to a direct manipulation Query Builder which maintains the facet structure.
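The semantic expansion described in this abstract can be illustrated with a generic spreading-activation sketch. This is not the FACET algorithm itself, whose details the abstract does not give: the link weights, the multiplicative decay, the cut-off threshold, and the names `LINKS` and `expand` are all assumptions chosen for illustration.

```python
import heapq

# Hypothetical weighted links between thesaurus concepts (illustrative values)
LINKS = {
    "locomotives": {"steam locomotives": 0.9, "rail vehicles": 0.8},
    "steam locomotives": {"locomotives": 0.9, "steam engines": 0.6},
    "rail vehicles": {"locomotives": 0.8, "wagons": 0.8},
    "steam engines": {"steam locomotives": 0.6},
    "wagons": {"rail vehicles": 0.8},
}

def expand(term, links, threshold=0.5):
    """Spread activation from a query term; closeness decays along each link."""
    best = {term: 1.0}            # highest activation seen per concept
    frontier = [(-1.0, term)]     # max-heap via negated activation
    while frontier:
        neg_activation, current = heapq.heappop(frontier)
        activation = -neg_activation
        if activation < best.get(current, 0.0):
            continue              # stale entry; a stronger path was found
        for neighbour, weight in links.get(current, {}).items():
            spread = activation * weight
            # Keep only concepts above the threshold, via their strongest path
            if spread >= threshold and spread > best.get(neighbour, 0.0):
                best[neighbour] = spread
                heapq.heappush(frontier, (-spread, neighbour))
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))

expanded = expand("locomotives", LINKS)
for concept, closeness in expanded:
    print(f"{concept}: {closeness:.2f}")
```

The resulting closeness values could then weight partial matches when ranking records, so that a record indexed with a nearby concept still scores, only lower than an exact match.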
