Document (#34677)

Author
Reiner, U.
Title
VZG-Projekt Colibri : Bewertung von automatisch DDC-klassifizierten Titeldatensätzen der Deutschen Nationalbibliothek (DNB)
Issue
August 2008 - February 2009.
Imprint
Göttingen : Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG)
Year
2009
Pages
111 p.
Series
VZG-Colibri-Bericht 1/2008
Abstract
The VZG project Colibri/DDC has been working on automatic methods for the Dewey Decimal Classification (DDC) since 2003. The project aims at uniform DDC indexing of bibliographic title records and at supporting DDC experts and non-experts, e.g. in the analysis and synthesis of DDC notations, in their quality control, and in DDC-based searching. This report concentrates on the first larger-scale automatic DDC classification and on the first automatic and intellectual (i.e. human) evaluation, both carried out with the classification component vc_dcl1. It is based on the 25,653 title records (12 weekly/monthly deliveries) from series A, B and H of the Deutsche Nationalbibliografie, which the Deutsche Nationalbibliothek (DNB) made available in November 2007. Chapter 2 explains the automatic DDC classification and the automatic evaluation; chapter 3 then addresses the DNB report "Colibri_Auswertung_DDC_Endbericht_Sommer_2008". Facts are clarified and questions are raised whose answers will set the course for the further classification tests. Chapter 4 goes beyond chapter 3 with further considerations on continuing the automatic DDC classification. The report is intended to deepen the understanding of the automatic methods.
Content
See: http://taipan.dyndns.org/~ul/colibri05.pdf
Theme
Automatisches Klassifizieren
Object
Colibri
DDC

Similar documents (author)

  1. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 5.73
    5.734131 = sum of:
      5.734131 = weight(author_txt:reiner in 1612) [ClassicSimilarity], result of:
        5.734131 = fieldWeight in 1612, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.174609 = idf(docFreq=11, maxDocs=42596)
          0.625 = fieldNorm(doc=1612)
    
  2. Reiner, U.: Anfragesprachen für Informationssysteme (1991) 5.73
    5.734131 = sum of:
      5.734131 = weight(author_txt:reiner in 5554) [ClassicSimilarity], result of:
        5.734131 = fieldWeight in 5554, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.174609 = idf(docFreq=11, maxDocs=42596)
          0.625 = fieldNorm(doc=5554)
    
  3. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 5.73
    5.734131 = sum of:
      5.734131 = weight(author_txt:reiner in 5855) [ClassicSimilarity], result of:
        5.734131 = fieldWeight in 5855, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.174609 = idf(docFreq=11, maxDocs=42596)
          0.625 = fieldNorm(doc=5855)
    
  4. Reiner, U.: Automatic analysis of DDC notations (2007) 5.73
    5.734131 = sum of:
      5.734131 = weight(author_txt:reiner in 1298) [ClassicSimilarity], result of:
        5.734131 = fieldWeight in 1298, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.174609 = idf(docFreq=11, maxDocs=42596)
          0.625 = fieldNorm(doc=1298)
    
  5. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 5.73
    5.734131 = sum of:
      5.734131 = weight(author_txt:reiner in 3346) [ClassicSimilarity], result of:
        5.734131 = fieldWeight in 3346, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.174609 = idf(docFreq=11, maxDocs=42596)
          0.625 = fieldNorm(doc=3346)
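
The author-match scores above all follow the same pattern: tf * idf * fieldNorm. The following is a minimal sketch of that arithmetic in Python, not the search engine's actual code. It assumes the classic Lucene formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the numbers shown here; the function names are hypothetical, and fieldNorm is simply taken from the explain output rather than recomputed.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency (assumed formula)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form that matches the values listed above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for author_txt:reiner.
score = field_weight(freq=1.0, doc_freq=11, max_docs=42596, field_norm=0.625)
print(f"{score:.4f}")
```

Because every listed document contains the author term once and has the same fieldNorm, all five entries receive the same score of about 5.73.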
    

Similar documents (content)

  1. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.48
    0.48179522 = sum of:
      0.48179522 = product of:
        1.204488 = sum of:
          0.06748976 = weight(abstract_txt:titeldatensätze in 4464) [ClassicSimilarity], result of:
            0.06748976 = score(doc=4464,freq=1.0), product of:
              0.15215993 = queryWeight, product of:
                1.0983515 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.014640728 = queryNorm
              0.44354486 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.023471672 = weight(abstract_txt:werden in 4464) [ClassicSimilarity], result of:
            0.023471672 = score(doc=4464,freq=5.0), product of:
              0.063468 = queryWeight, product of:
                1.228653 = boost
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.014640728 = queryNorm
              0.369819 = fieldWeight in 4464, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.030397722 = weight(abstract_txt:projekt in 4464) [ClassicSimilarity], result of:
            0.030397722 = score(doc=4464,freq=1.0), product of:
              0.11264513 = queryWeight, product of:
                1.3364798 = boost
                5.756882 = idf(docFreq=365, maxDocs=42596)
                0.014640728 = queryNorm
              0.26985386 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.756882 = idf(docFreq=365, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.061498314 = weight(abstract_txt:verfahren in 4464) [ClassicSimilarity], result of:
            0.061498314 = score(doc=4464,freq=4.0), product of:
              0.113511674 = queryWeight, product of:
                1.3416106 = boost
                5.7789826 = idf(docFreq=357, maxDocs=42596)
                0.014640728 = queryNorm
              0.54177964 = fieldWeight in 4464, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7789826 = idf(docFreq=357, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.03191923 = weight(abstract_txt:dewey in 4464) [ClassicSimilarity], result of:
            0.03191923 = score(doc=4464,freq=1.0), product of:
              0.11637329 = queryWeight, product of:
                1.3584162 = boost
                5.851373 = idf(docFreq=332, maxDocs=42596)
                0.014640728 = queryNorm
              0.2742831 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.851373 = idf(docFreq=332, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.032629017 = weight(abstract_txt:deutschen in 4464) [ClassicSimilarity], result of:
            0.032629017 = score(doc=4464,freq=1.0), product of:
              0.13518177 = queryWeight, product of:
                1.7931263 = boost
                5.149257 = idf(docFreq=671, maxDocs=42596)
                0.014640728 = queryNorm
              0.24137142 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.149257 = idf(docFreq=671, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.21649566 = weight(abstract_txt:titeldatensätzen in 4464) [ClassicSimilarity], result of:
            0.21649566 = score(doc=4464,freq=2.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              0.65414596 = fieldWeight in 4464, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.15308556 = weight(abstract_txt:colibri in 4464) [ClassicSimilarity], result of:
            0.15308556 = score(doc=4464,freq=1.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              0.46255106 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.29740396 = weight(abstract_txt:klassifizierung in 4464) [ClassicSimilarity], result of:
            0.29740396 = score(doc=4464,freq=5.0), product of:
              0.3449504 = queryWeight, product of:
                2.8643768 = boost
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.014640728 = queryNorm
              0.86216444 = fieldWeight in 4464, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.29009727 = weight(abstract_txt:automatischen in 4464) [ClassicSimilarity], result of:
            0.29009727 = score(doc=4464,freq=5.0), product of:
              0.4022575 = queryWeight, product of:
                3.9932663 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.014640728 = queryNorm
              0.72117305 = fieldWeight in 4464, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
        0.4 = coord(10/25)
    
  2. Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.24
    0.24079311 = sum of:
      0.24079311 = product of:
        1.2039655 = sum of:
          0.15747611 = weight(abstract_txt:titeldatensätze in 1123) [ClassicSimilarity], result of:
            0.15747611 = score(doc=1123,freq=1.0), product of:
              0.15215993 = queryWeight, product of:
                1.0983515 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.014640728 = queryNorm
              1.0349381 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.109375 = fieldNorm(doc=1123)
          0.07613437 = weight(abstract_txt:deutschen in 1123) [ClassicSimilarity], result of:
            0.07613437 = score(doc=1123,freq=1.0), product of:
              0.13518177 = queryWeight, product of:
                1.7931263 = boost
                5.149257 = idf(docFreq=671, maxDocs=42596)
                0.014640728 = queryNorm
              0.5632 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.149257 = idf(docFreq=671, maxDocs=42596)
                0.109375 = fieldNorm(doc=1123)
          0.35719964 = weight(abstract_txt:titeldatensätzen in 1123) [ClassicSimilarity], result of:
            0.35719964 = score(doc=1123,freq=1.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              1.0792859 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.109375 = fieldNorm(doc=1123)
          0.18505028 = weight(abstract_txt:bewertung in 1123) [ClassicSimilarity], result of:
            0.18505028 = score(doc=1123,freq=1.0), product of:
              0.24437538 = queryWeight, product of:
                2.4109075 = boost
                6.923317 = idf(docFreq=113, maxDocs=42596)
                0.014640728 = queryNorm
              0.7572378 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.923317 = idf(docFreq=113, maxDocs=42596)
                0.109375 = fieldNorm(doc=1123)
          0.4281051 = weight(abstract_txt:automatischen in 1123) [ClassicSimilarity], result of:
            0.4281051 = score(doc=1123,freq=2.0), product of:
              0.4022575 = queryWeight, product of:
                3.9932663 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.014640728 = queryNorm
              1.0642563 = fieldWeight in 1123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.109375 = fieldNorm(doc=1123)
        0.2 = coord(5/25)
    
  3. Balakrishnan, U.; Krausz, A.; Voss, J.: Cocoda - ein Konkordanztool für bibliothekarische Klassifikationssysteme (2015) 0.16
    0.1594862 = sum of:
      0.1594862 = product of:
        0.797431 = sum of:
          0.024241438 = weight(abstract_txt:werden in 3031) [ClassicSimilarity], result of:
            0.024241438 = score(doc=3031,freq=3.0), product of:
              0.063468 = queryWeight, product of:
                1.228653 = boost
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.014640728 = queryNorm
              0.3819474 = fieldWeight in 3031, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.0625 = fieldNorm(doc=3031)
          0.04255897 = weight(abstract_txt:dewey in 3031) [ClassicSimilarity], result of:
            0.04255897 = score(doc=3031,freq=1.0), product of:
              0.11637329 = queryWeight, product of:
                1.3584162 = boost
                5.851373 = idf(docFreq=332, maxDocs=42596)
                0.014640728 = queryNorm
              0.36571082 = fieldWeight in 3031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.851373 = idf(docFreq=332, maxDocs=42596)
                0.0625 = fieldNorm(doc=3031)
          0.20411408 = weight(abstract_txt:titeldatensätzen in 3031) [ClassicSimilarity], result of:
            0.20411408 = score(doc=3031,freq=1.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              0.61673474 = fieldWeight in 3031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.0625 = fieldNorm(doc=3031)
          0.35353592 = weight(abstract_txt:colibri in 3031) [ClassicSimilarity], result of:
            0.35353592 = score(doc=3031,freq=3.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              1.0682158 = fieldWeight in 3031, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.0625 = fieldNorm(doc=3031)
          0.17298058 = weight(abstract_txt:automatischen in 3031) [ClassicSimilarity], result of:
            0.17298058 = score(doc=3031,freq=1.0), product of:
              0.4022575 = queryWeight, product of:
                3.9932663 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.014640728 = queryNorm
              0.4300245 = fieldWeight in 3031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.0625 = fieldNorm(doc=3031)
        0.2 = coord(5/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.14859222 = sum of:
      0.14859222 = product of:
        0.74296105 = sum of:
          0.08998635 = weight(abstract_txt:titeldatensätze in 5855) [ClassicSimilarity], result of:
            0.08998635 = score(doc=5855,freq=1.0), product of:
              0.15215993 = queryWeight, product of:
                1.0983515 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.014640728 = queryNorm
              0.5913932 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.013995801 = weight(abstract_txt:werden in 5855) [ClassicSimilarity], result of:
            0.013995801 = score(doc=5855,freq=1.0), product of:
              0.063468 = queryWeight, product of:
                1.228653 = boost
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.014640728 = queryNorm
              0.22051744 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.28866088 = weight(abstract_txt:colibri in 5855) [ClassicSimilarity], result of:
            0.28866088 = score(doc=5855,freq=2.0), product of:
              0.33095926 = queryWeight, product of:
                2.2908332 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.014640728 = queryNorm
              0.87219465 = fieldWeight in 5855, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.17733747 = weight(abstract_txt:klassifizierung in 5855) [ClassicSimilarity], result of:
            0.17733747 = score(doc=5855,freq=1.0), product of:
              0.3449504 = queryWeight, product of:
                2.8643768 = boost
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.014640728 = queryNorm
              0.51409554 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.17298058 = weight(abstract_txt:automatischen in 5855) [ClassicSimilarity], result of:
            0.17298058 = score(doc=5855,freq=1.0), product of:
              0.4022575 = queryWeight, product of:
                3.9932663 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.014640728 = queryNorm
              0.4300245 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
        0.2 = coord(5/25)
    
  5. Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.13
    0.1300898 = sum of:
      0.1300898 = product of:
        0.650449 = sum of:
          0.034989502 = weight(abstract_txt:werden in 2590) [ClassicSimilarity], result of:
            0.034989502 = score(doc=2590,freq=4.0), product of:
              0.063468 = queryWeight, product of:
                1.228653 = boost
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.014640728 = queryNorm
              0.5512936 = fieldWeight in 2590, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.528279 = idf(docFreq=3398, maxDocs=42596)
                0.078125 = fieldNorm(doc=2590)
          0.08876516 = weight(abstract_txt:verfahren in 2590) [ClassicSimilarity], result of:
            0.08876516 = score(doc=2590,freq=3.0), product of:
              0.113511674 = queryWeight, product of:
                1.3416106 = boost
                5.7789826 = idf(docFreq=357, maxDocs=42596)
                0.014640728 = queryNorm
              0.7819915 = fieldWeight in 2590, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7789826 = idf(docFreq=357, maxDocs=42596)
                0.078125 = fieldNorm(doc=2590)
          0.088796735 = weight(abstract_txt:automatische in 2590) [ClassicSimilarity], result of:
            0.088796735 = score(doc=2590,freq=1.0), product of:
              0.16375098 = queryWeight, product of:
                1.6113807 = boost
                6.9410167 = idf(docFreq=111, maxDocs=42596)
                0.014640728 = queryNorm
              0.5422669 = fieldWeight in 2590, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9410167 = idf(docFreq=111, maxDocs=42596)
                0.078125 = fieldNorm(doc=2590)
          0.22167183 = weight(abstract_txt:klassifizierung in 2590) [ClassicSimilarity], result of:
            0.22167183 = score(doc=2590,freq=1.0), product of:
              0.3449504 = queryWeight, product of:
                2.8643768 = boost
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.014640728 = queryNorm
              0.64261943 = fieldWeight in 2590, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.078125 = fieldNorm(doc=2590)
          0.21622574 = weight(abstract_txt:automatischen in 2590) [ClassicSimilarity], result of:
            0.21622574 = score(doc=2590,freq=1.0), product of:
              0.4022575 = queryWeight, product of:
                3.9932663 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.014640728 = queryNorm
              0.53753066 = fieldWeight in 2590, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.078125 = fieldNorm(doc=2590)
        0.2 = coord(5/25)
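
The content-similarity scores combine per-term products with a coordination factor: each term contributes queryWeight * fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and the sum is multiplied by coord (matching terms / query terms). The sketch below recomputes the score of entry 2 (doc=1123) under those assumptions. The helper names are hypothetical; boost, queryNorm and fieldNorm are copied from the explain output rather than derived.

```python
import math

MAX_DOCS = 42596
QUERY_NORM = 0.014640728  # copied from the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed classic idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (freq, docFreq, boost) for the five query terms matched in doc 1123.
terms_1123 = [
    (1.0, 8, 1.0983515),    # titeldatensätze
    (1.0, 671, 1.7931263),  # deutschen
    (1.0, 5, 2.2908332),    # titeldatensätzen
    (1.0, 113, 2.4109075),  # bewertung
    (2.0, 118, 3.9932663),  # automatischen
]
FIELD_NORM_1123 = 0.109375

total = sum(term_score(f, df, b, FIELD_NORM_1123) for f, df, b in terms_1123)
coord = 5 / 25  # 5 of the 25 query terms matched
score = coord * total
print(f"sum={total:.4f} score={score:.4f}")
```

The coord factor explains why entry 1, which matches 10 of 25 terms, scores about twice as high as entry 2 even though the two unscaled sums are nearly equal (1.204488 vs. 1.2039655).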