Document (#34677)

Author
Reiner, U.
Title
VZG-Projekt Colibri : Bewertung von automatisch DDC-klassifizierten Titeldatensätzen der Deutschen Nationalbibliothek (DNB)
Issue
August 2008 - February 2009.
Imprint
Göttingen : Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG)
Year
2009
Pages
111 pp.
Series
VZG-Colibri-Bericht 1/2008
Abstract
The VZG project Colibri/DDC has been working since 2003 on automatic methods for the Dewey Decimal Classification (DDC). The project aims at uniform DDC indexing of bibliographic title records and at supporting DDC experts and DDC laypersons, e.g. in the analysis and synthesis of DDC notations, in their quality control, and in DDC-based searching. This report concentrates on the first larger-scale automatic DDC classification and the first automatic and intellectual (manual) evaluation using the classification component vc_dcl1. It is based on the 25,653 title records (12 weekly/monthly deliveries) of the Deutsche Nationalbibliografie, series A, B and H, which the Deutsche Nationalbibliothek (DNB) made available in November 2007. After the automatic DDC classification and the automatic evaluation are explained in chapter 2, chapter 3 addresses the DNB report "Colibri_Auswertung_DDC_Endbericht_Sommer_2008". Issues are clarified and questions are posed whose answers will set the course for the further classification tests. Chapter 4 offers considerations that go beyond chapter 3 on how to continue the automatic DDC classification. The report is intended to deepen the understanding of the automatic methods.
Content
See: http://taipan.dyndns.org/~ul/colibri05.pdf.
Theme
Automatic classification
Object
Colibri
DDC

Similar documents (author)

  1. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 5.73
    5.7298613 = sum of:
      5.7298613 = weight(author_txt:reiner in 1612) [ClassicSimilarity], result of:
        5.7298613 = fieldWeight in 1612, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.167778 = idf(docFreq=11, maxDocs=42306)
          0.625 = fieldNorm(doc=1612)
    
  2. Reiner, U.: Anfragesprachen für Informationssysteme (1991) 5.73
    5.7298613 = sum of:
      5.7298613 = weight(author_txt:reiner in 554) [ClassicSimilarity], result of:
        5.7298613 = fieldWeight in 554, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.167778 = idf(docFreq=11, maxDocs=42306)
          0.625 = fieldNorm(doc=554)
    
  3. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 5.73
    5.7298613 = sum of:
      5.7298613 = weight(author_txt:reiner in 855) [ClassicSimilarity], result of:
        5.7298613 = fieldWeight in 855, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.167778 = idf(docFreq=11, maxDocs=42306)
          0.625 = fieldNorm(doc=855)
    
  4. Reiner, U.: Automatic analysis of DDC notations (2007) 5.73
    5.7298613 = sum of:
      5.7298613 = weight(author_txt:reiner in 2119) [ClassicSimilarity], result of:
        5.7298613 = fieldWeight in 2119, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.167778 = idf(docFreq=11, maxDocs=42306)
          0.625 = fieldNorm(doc=2119)
    
  5. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 5.73
    5.7298613 = sum of:
      5.7298613 = weight(author_txt:reiner in 4167) [ClassicSimilarity], result of:
        5.7298613 = fieldWeight in 4167, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.167778 = idf(docFreq=11, maxDocs=42306)
          0.625 = fieldNorm(doc=4167)
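
The author scores above all follow the same pattern: fieldWeight = tf * idf * fieldNorm. A minimal Python sketch of that arithmetic is given below. It assumes the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the numbers shown in this record; the function names are illustrative and are not part of any library API. The fieldNorm value is taken from the listing as given, not recomputed.

```python
import math


def tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that matches the listing
    # (assumption: 1 + ln(maxDocs / (docFreq + 1))).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain output.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm


# Values copied from the entries for author_txt:reiner.
score = field_weight(freq=1.0, doc_freq=11, max_docs=42306, field_norm=0.625)
print(f"idf   = {idf(11, 42306):.6f}")
print(f"score = {score:.6f}")
```

Because freq, docFreq and fieldNorm are identical for all five entries, every author match receives the same score; up to single-precision rounding, the printed value should agree with the 5.7298613 listed above.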
    

Similar documents (content)

  1. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.48
    0.48116818 = sum of:
      0.48116818 = product of:
        1.2029204 = sum of:
          0.06740054 = weight(abstract_txt:titeldatensätze in 285) [ClassicSimilarity], result of:
            0.06740054 = score(doc=285,freq=1.0), product of:
              0.15206856 = queryWeight, product of:
                1.0984296 = boost
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.014641467 = queryNorm
              0.44322467 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.023532134 = weight(abstract_txt:werden in 285) [ClassicSimilarity], result of:
            0.023532134 = score(doc=285,freq=5.0), product of:
              0.063594826 = queryWeight, product of:
                1.2303368 = boost
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.014641467 = queryNorm
              0.3700322 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.030315205 = weight(abstract_txt:projekt in 285) [ClassicSimilarity], result of:
            0.030315205 = score(doc=285,freq=1.0), product of:
              0.11247281 = queryWeight, product of:
                1.3359532 = boost
                5.750051 = idf(docFreq=365, maxDocs=42306)
                0.014641467 = queryNorm
              0.26953363 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.750051 = idf(docFreq=365, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.061510943 = weight(abstract_txt:verfahren in 285) [ClassicSimilarity], result of:
            0.061510943 = score(doc=285,freq=4.0), product of:
              0.11355915 = queryWeight, product of:
                1.3423896 = boost
                5.7777534 = idf(docFreq=355, maxDocs=42306)
                0.014641467 = queryNorm
              0.54166436 = fieldWeight in 285, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7777534 = idf(docFreq=355, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.032032292 = weight(abstract_txt:dewey in 285) [ClassicSimilarity], result of:
            0.032032292 = score(doc=285,freq=1.0), product of:
              0.11668075 = queryWeight, product of:
                1.3607148 = boost
                5.8566265 = idf(docFreq=328, maxDocs=42306)
                0.014641467 = queryNorm
              0.27452937 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8566265 = idf(docFreq=328, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.03261174 = weight(abstract_txt:deutschen in 285) [ClassicSimilarity], result of:
            0.03261174 = score(doc=285,freq=1.0), product of:
              0.13517205 = queryWeight, product of:
                1.793728 = boost
                5.1469 = idf(docFreq=668, maxDocs=42306)
                0.014641467 = queryNorm
              0.24126095 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1469 = idf(docFreq=668, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.21622874 = weight(abstract_txt:titeldatensätzen in 285) [ClassicSimilarity], result of:
            0.21622874 = score(doc=285,freq=2.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              0.65369314 = fieldWeight in 285, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.1528968 = weight(abstract_txt:colibri in 285) [ClassicSimilarity], result of:
            0.1528968 = score(doc=285,freq=1.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              0.46223086 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.29691392 = weight(abstract_txt:klassifizierung in 285) [ClassicSimilarity], result of:
            0.29691392 = score(doc=285,freq=5.0), product of:
              0.34466827 = queryWeight, product of:
                2.8642688 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.014641467 = queryNorm
              0.8614484 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
          0.28947818 = weight(abstract_txt:automatischen in 285) [ClassicSimilarity], result of:
            0.28947818 = score(doc=285,freq=5.0), product of:
              0.40179798 = queryWeight, product of:
                3.9924674 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.014641467 = queryNorm
              0.720457 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.046875 = fieldNorm(doc=285)
        0.4 = coord(10/25)
    
  2. Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.24
    0.2405356 = sum of:
      0.2405356 = product of:
        1.202678 = sum of:
          0.15726791 = weight(abstract_txt:titeldatensätze in 2123) [ClassicSimilarity], result of:
            0.15726791 = score(doc=2123,freq=1.0), product of:
              0.15206856 = queryWeight, product of:
                1.0984296 = boost
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.014641467 = queryNorm
              1.0341909 = fieldWeight in 2123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.109375 = fieldNorm(doc=2123)
          0.07609405 = weight(abstract_txt:deutschen in 2123) [ClassicSimilarity], result of:
            0.07609405 = score(doc=2123,freq=1.0), product of:
              0.13517205 = queryWeight, product of:
                1.793728 = boost
                5.1469 = idf(docFreq=668, maxDocs=42306)
                0.014641467 = queryNorm
              0.5629422 = fieldWeight in 2123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1469 = idf(docFreq=668, maxDocs=42306)
                0.109375 = fieldNorm(doc=2123)
          0.35675922 = weight(abstract_txt:titeldatensätzen in 2123) [ClassicSimilarity], result of:
            0.35675922 = score(doc=2123,freq=1.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              1.0785387 = fieldWeight in 2123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.109375 = fieldNorm(doc=2123)
          0.18536532 = weight(abstract_txt:bewertung in 2123) [ClassicSimilarity], result of:
            0.18536532 = score(doc=2123,freq=1.0), product of:
              0.24472147 = queryWeight, product of:
                2.4135103 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.014641467 = queryNorm
              0.7574543 = fieldWeight in 2123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.109375 = fieldNorm(doc=2123)
          0.42719153 = weight(abstract_txt:automatischen in 2123) [ClassicSimilarity], result of:
            0.42719153 = score(doc=2123,freq=2.0), product of:
              0.40179798 = queryWeight, product of:
                3.9924674 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.014641467 = queryNorm
              1.0631998 = fieldWeight in 2123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.109375 = fieldNorm(doc=2123)
        0.2 = coord(5/25)
    
  3. Balakrishnan, U.; Krausz, A.; Voss, J.: Cocoda - ein Konkordanztool für bibliothekarische Klassifikationssysteme (2015) 0.16
    0.15931748 = sum of:
      0.15931748 = product of:
        0.7965874 = sum of:
          0.02430388 = weight(abstract_txt:werden in 4031) [ClassicSimilarity], result of:
            0.02430388 = score(doc=4031,freq=3.0), product of:
              0.063594826 = queryWeight, product of:
                1.2303368 = boost
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.014641467 = queryNorm
              0.38216758 = fieldWeight in 4031, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.0625 = fieldNorm(doc=4031)
          0.042709723 = weight(abstract_txt:dewey in 4031) [ClassicSimilarity], result of:
            0.042709723 = score(doc=4031,freq=1.0), product of:
              0.11668075 = queryWeight, product of:
                1.3607148 = boost
                5.8566265 = idf(docFreq=328, maxDocs=42306)
                0.014641467 = queryNorm
              0.36603916 = fieldWeight in 4031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8566265 = idf(docFreq=328, maxDocs=42306)
                0.0625 = fieldNorm(doc=4031)
          0.2038624 = weight(abstract_txt:titeldatensätzen in 4031) [ClassicSimilarity], result of:
            0.2038624 = score(doc=4031,freq=1.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              0.6163078 = fieldWeight in 4031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.0625 = fieldNorm(doc=4031)
          0.35310003 = weight(abstract_txt:colibri in 4031) [ClassicSimilarity], result of:
            0.35310003 = score(doc=4031,freq=3.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              1.0674764 = fieldWeight in 4031, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.0625 = fieldNorm(doc=4031)
          0.17261143 = weight(abstract_txt:automatischen in 4031) [ClassicSimilarity], result of:
            0.17261143 = score(doc=4031,freq=1.0), product of:
              0.40179798 = queryWeight, product of:
                3.9924674 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.014641467 = queryNorm
              0.42959756 = fieldWeight in 4031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.0625 = fieldNorm(doc=4031)
        0.2 = coord(5/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.14837217 = sum of:
      0.14837217 = product of:
        0.74186087 = sum of:
          0.08986738 = weight(abstract_txt:titeldatensätze in 855) [ClassicSimilarity], result of:
            0.08986738 = score(doc=855,freq=1.0), product of:
              0.15206856 = queryWeight, product of:
                1.0984296 = boost
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.014641467 = queryNorm
              0.5909662 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.45546 = idf(docFreq=8, maxDocs=42306)
                0.0625 = fieldNorm(doc=855)
          0.014031853 = weight(abstract_txt:werden in 855) [ClassicSimilarity], result of:
            0.014031853 = score(doc=855,freq=1.0), product of:
              0.063594826 = queryWeight, product of:
                1.2303368 = boost
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.014641467 = queryNorm
              0.22064456 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.0625 = fieldNorm(doc=855)
          0.28830498 = weight(abstract_txt:colibri in 855) [ClassicSimilarity], result of:
            0.28830498 = score(doc=855,freq=2.0), product of:
              0.33078018 = queryWeight, product of:
                2.291064 = boost
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.014641467 = queryNorm
              0.87159085 = fieldWeight in 855, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.860925 = idf(docFreq=5, maxDocs=42306)
                0.0625 = fieldNorm(doc=855)
          0.17704524 = weight(abstract_txt:klassifizierung in 855) [ClassicSimilarity], result of:
            0.17704524 = score(doc=855,freq=1.0), product of:
              0.34466827 = queryWeight, product of:
                2.8642688 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.014641467 = queryNorm
              0.51366854 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.0625 = fieldNorm(doc=855)
          0.17261143 = weight(abstract_txt:automatischen in 855) [ClassicSimilarity], result of:
            0.17261143 = score(doc=855,freq=1.0), product of:
              0.40179798 = queryWeight, product of:
                3.9924674 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.014641467 = queryNorm
              0.42959756 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.0625 = fieldNorm(doc=855)
        0.2 = coord(5/25)
    
  5. Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.13
    0.12997755 = sum of:
      0.12997755 = product of:
        0.64988774 = sum of:
          0.035079632 = weight(abstract_txt:werden in 3411) [ClassicSimilarity], result of:
            0.035079632 = score(doc=3411,freq=4.0), product of:
              0.063594826 = queryWeight, product of:
                1.2303368 = boost
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.014641467 = queryNorm
              0.5516114 = fieldWeight in 3411, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.530313 = idf(docFreq=3368, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.088783406 = weight(abstract_txt:verfahren in 3411) [ClassicSimilarity], result of:
            0.088783406 = score(doc=3411,freq=3.0), product of:
              0.11355915 = queryWeight, product of:
                1.3423896 = boost
                5.7777534 = idf(docFreq=355, maxDocs=42306)
                0.014641467 = queryNorm
              0.7818252 = fieldWeight in 3411, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7777534 = idf(docFreq=355, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.088953815 = weight(abstract_txt:automatische in 3411) [ClassicSimilarity], result of:
            0.088953815 = score(doc=3411,freq=1.0), product of:
              0.16399014 = queryWeight, product of:
                1.613156 = boost
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.014641467 = queryNorm
              0.5424339 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.22130656 = weight(abstract_txt:klassifizierung in 3411) [ClassicSimilarity], result of:
            0.22130656 = score(doc=3411,freq=1.0), product of:
              0.34466827 = queryWeight, product of:
                2.8642688 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.014641467 = queryNorm
              0.6420857 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.2157643 = weight(abstract_txt:automatischen in 3411) [ClassicSimilarity], result of:
            0.2157643 = score(doc=3411,freq=1.0), product of:
              0.40179798 = queryWeight, product of:
                3.9924674 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.014641467 = queryNorm
              0.53699696 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
        0.2 = coord(5/25)
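
The content scores above combine several per-term scores. Each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by the coordination factor coord(matched/total). The following Python sketch retraces this for the first hit (doc=285), under the same assumed formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical; boost, queryNorm and fieldNorm are copied from the listing, not derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed form: 1 + ln(maxDocs / (docFreq + 1)); it matches the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    #   queryWeight = boost * idf * queryNorm
    #   fieldWeight = sqrt(freq) * idf * fieldNorm
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    # Sum of the term scores, scaled by coord = matched / total query terms.
    return sum(term_scores) * (matched / total)


# Term "titeldatensätze" in doc 285, values copied from the listing.
single = term_score(freq=1.0, doc_freq=8, max_docs=42306,
                    boost=1.0984296, query_norm=0.014641467,
                    field_norm=0.046875)

# The ten per-term scores listed for doc 285, with coord(10/25).
scores_285 = [0.06740054, 0.023532134, 0.030315205, 0.061510943,
              0.032032292, 0.03261174, 0.21622874, 0.1528968,
              0.29691392, 0.28947818]
total_285 = document_score(scores_285, matched=10, total=25)

print(f"term score     = {single:.8f}")
print(f"document score = {total_285:.8f}")
```

The coord factor explains why the first hit ranks well above the others: it matches 10 of the 25 query terms (0.4), whereas the remaining hits match only 5 (0.2), although their unscaled sums are partly of similar size (1.2029204 versus 1.202678 for the second hit).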