Document (#34676)

Author
Reiner, U.
Title
VZG-Projekt Colibri : Bewertung von automatisch DDC-klassifizierten Titeldatensätzen der Deutschen Nationalbibliothek (DNB)
Issue
August 2008 - February 2009.
Imprint
Göttingen : Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG)
Year
2009
Pages
111 pp.
Series
VZG-Colibri-Bericht 1/2008
Abstract
The VZG project Colibri/DDC has been working since 2003 on automatic methods for the Dewey Decimal Classification (DDC). The project aims at uniform DDC indexing of bibliographic title records and at supporting DDC experts and DDC laypersons, e.g. in the analysis and synthesis of DDC notations, in their quality control, and in DDC-based searching. This report concentrates on the first larger-scale automatic DDC classification and the first automatic and intellectual evaluation using the classification component vc_dcl1. It is based on the 25,653 title records (12 weekly/monthly deliveries) of the Deutsche Nationalbibliografie, series A, B and H, which the Deutsche Nationalbibliothek (DNB) made available in November 2007. After the automatic DDC classification and the automatic evaluation are explained in chapter 2, chapter 3 addresses the DNB report "Colibri_Auswertung_DDC_Endbericht_Sommer_2008". Issues are clarified and questions are raised whose answers will set the course for the further classification tests. Chapter 4 offers further considerations, going beyond chapter 3, on continuing the automatic DDC classification. The report is intended to deepen understanding of the automatic methods.
Content
See: http://taipan.dyndns.org/~ul/colibri05.pdf.
Theme
Automatic classification (Automatisches Klassifizieren)
Object
Colibri
DDC

Similar documents (author)

  1. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 5.71
    5.7074614 = sum of:
      5.7074614 = weight(author_txt:reiner in 611) [ClassicSimilarity], result of:
        5.7074614 = fieldWeight in 611, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.625 = fieldNorm(doc=611)
    
  2. Reiner, U.: Anfragesprachen für Informationssysteme (1991) 5.71
    5.7074614 = sum of:
      5.7074614 = weight(author_txt:reiner in 4553) [ClassicSimilarity], result of:
        5.7074614 = fieldWeight in 4553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.625 = fieldNorm(doc=4553)
    
  3. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 5.71
    5.7074614 = sum of:
      5.7074614 = weight(author_txt:reiner in 4854) [ClassicSimilarity], result of:
        5.7074614 = fieldWeight in 4854, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.625 = fieldNorm(doc=4854)
    
  4. Reiner, U.: Automatic analysis of DDC notations (2007) 5.71
    5.7074614 = sum of:
      5.7074614 = weight(author_txt:reiner in 118) [ClassicSimilarity], result of:
        5.7074614 = fieldWeight in 118, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.625 = fieldNorm(doc=118)
    
  5. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 5.71
    5.7074614 = sum of:
      5.7074614 = weight(author_txt:reiner in 2166) [ClassicSimilarity], result of:
        5.7074614 = fieldWeight in 2166, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.131938 = idf(docFreq=12, maxDocs=44218)
          0.625 = fieldNorm(doc=2166)
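
The author-match scores above can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers shown in this listing. The function names are hypothetical and chosen only for illustration.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form, matches the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the "author_txt:reiner" entries above.
author_score = field_weight(freq=1.0, doc_freq=12, max_docs=44218, field_norm=0.625)
print(f"{author_score:.4f}")
```

All five author matches share the same inputs (freq=1.0, docFreq=12, fieldNorm=0.625), which is why they all receive the same score. Small differences in the last digits are expected, since the listing appears to use single-precision arithmetic.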
    

Similar documents (content)

  1. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.48
    0.48239604 = sum of:
      0.48239604 = product of:
        1.2059901 = sum of:
          0.06853035 = weight(abstract_txt:titeldatensätze in 3284) [ClassicSimilarity], result of:
            0.06853035 = score(doc=3284,freq=1.0), product of:
              0.1538982 = queryWeight, product of:
                1.1038617 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.014676101 = queryNorm
              0.44529667 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.023115084 = weight(abstract_txt:werden in 3284) [ClassicSimilarity], result of:
            0.023115084 = score(doc=3284,freq=5.0), product of:
              0.06289637 = queryWeight, product of:
                1.2222817 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014676101 = queryNorm
              0.36751062 = fieldWeight in 3284, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.030500911 = weight(abstract_txt:projekt in 3284) [ClassicSimilarity], result of:
            0.030500911 = score(doc=3284,freq=1.0), product of:
              0.113030784 = queryWeight, product of:
                1.3378619 = boost
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.014676101 = queryNorm
              0.26984605 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.756716 = idf(docFreq=379, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.061169736 = weight(abstract_txt:verfahren in 3284) [ClassicSimilarity], result of:
            0.061169736 = score(doc=3284,freq=4.0), product of:
              0.11323811 = queryWeight, product of:
                1.3390883 = boost
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.014676101 = queryNorm
              0.5401868 = fieldWeight in 3284, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.03206272 = weight(abstract_txt:dewey in 3284) [ClassicSimilarity], result of:
            0.03206272 = score(doc=3284,freq=1.0), product of:
              0.116857104 = queryWeight, product of:
                1.3603181 = boost
                5.853343 = idf(docFreq=344, maxDocs=44218)
                0.014676101 = queryNorm
              0.27437544 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.853343 = idf(docFreq=344, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.032460846 = weight(abstract_txt:deutschen in 3284) [ClassicSimilarity], result of:
            0.032460846 = score(doc=3284,freq=1.0), product of:
              0.13487305 = queryWeight, product of:
                1.7898685 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.014676101 = queryNorm
              0.24067703 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.20962736 = weight(abstract_txt:titeldatensätzen in 3284) [ClassicSimilarity], result of:
            0.20962736 = score(doc=3284,freq=2.0), product of:
              0.32429743 = queryWeight, product of:
                2.2661293 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.014676101 = queryNorm
              0.6464046 = fieldWeight in 3284, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.15537053 = weight(abstract_txt:colibri in 3284) [ClassicSimilarity], result of:
            0.15537053 = score(doc=3284,freq=1.0), product of:
              0.3346319 = queryWeight, product of:
                2.3019536 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.014676101 = queryNorm
              0.46430284 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.3025257 = weight(abstract_txt:klassifizierung in 3284) [ClassicSimilarity], result of:
            0.3025257 = score(doc=3284,freq=5.0), product of:
              0.34930393 = queryWeight, product of:
                2.8804495 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.014676101 = queryNorm
              0.8660816 = fieldWeight in 3284, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.29062688 = weight(abstract_txt:automatischen in 3284) [ClassicSimilarity], result of:
            0.29062688 = score(doc=3284,freq=5.0), product of:
              0.4032138 = queryWeight, product of:
                3.995311 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              0.72077614 = fieldWeight in 3284, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
        0.4 = coord(10/25)
    
  2. Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.24
    0.23847233 = sum of:
      0.23847233 = product of:
        1.1923616 = sum of:
          0.15990415 = weight(abstract_txt:titeldatensätze in 122) [ClassicSimilarity], result of:
            0.15990415 = score(doc=122,freq=1.0), product of:
              0.1538982 = queryWeight, product of:
                1.1038617 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.014676101 = queryNorm
              1.0390255 = fieldWeight in 122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.109375 = fieldNorm(doc=122)
          0.07574197 = weight(abstract_txt:deutschen in 122) [ClassicSimilarity], result of:
            0.07574197 = score(doc=122,freq=1.0), product of:
              0.13487305 = queryWeight, product of:
                1.7898685 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.014676101 = queryNorm
              0.5615797 = fieldWeight in 122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.109375 = fieldNorm(doc=122)
          0.34586748 = weight(abstract_txt:titeldatensätzen in 122) [ClassicSimilarity], result of:
            0.34586748 = score(doc=122,freq=1.0), product of:
              0.32429743 = queryWeight, product of:
                2.2661293 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.014676101 = queryNorm
              1.0665132 = fieldWeight in 122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.109375 = fieldNorm(doc=122)
          0.18196122 = weight(abstract_txt:bewertung in 122) [ClassicSimilarity], result of:
            0.18196122 = score(doc=122,freq=1.0), product of:
              0.24192831 = queryWeight, product of:
                2.3971868 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              0.7521287 = fieldWeight in 122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.109375 = fieldNorm(doc=122)
          0.42888668 = weight(abstract_txt:automatischen in 122) [ClassicSimilarity], result of:
            0.42888668 = score(doc=122,freq=2.0), product of:
              0.4032138 = queryWeight, product of:
                3.995311 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              1.0636706 = fieldWeight in 122, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.109375 = fieldNorm(doc=122)
        0.2 = coord(5/25)
    
  3. Balakrishnan, U.; Krausz, A.; Voss, J.: Cocoda - ein Konkordanztool für bibliothekarische Klassifikationssysteme (2015) 0.16
    0.15927427 = sum of:
      0.15927427 = product of:
        0.7963713 = sum of:
          0.023873154 = weight(abstract_txt:werden in 2030) [ClassicSimilarity], result of:
            0.023873154 = score(doc=2030,freq=3.0), product of:
              0.06289637 = queryWeight, product of:
                1.2222817 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014676101 = queryNorm
              0.3795633 = fieldWeight in 2030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=2030)
          0.042750295 = weight(abstract_txt:dewey in 2030) [ClassicSimilarity], result of:
            0.042750295 = score(doc=2030,freq=1.0), product of:
              0.116857104 = queryWeight, product of:
                1.3603181 = boost
                5.853343 = idf(docFreq=344, maxDocs=44218)
                0.014676101 = queryNorm
              0.36583394 = fieldWeight in 2030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.853343 = idf(docFreq=344, maxDocs=44218)
                0.0625 = fieldNorm(doc=2030)
          0.19763856 = weight(abstract_txt:titeldatensätzen in 2030) [ClassicSimilarity], result of:
            0.19763856 = score(doc=2030,freq=1.0), product of:
              0.32429743 = queryWeight, product of:
                2.2661293 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.014676101 = queryNorm
              0.6094361 = fieldWeight in 2030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0625 = fieldNorm(doc=2030)
          0.35881287 = weight(abstract_txt:colibri in 2030) [ClassicSimilarity], result of:
            0.35881287 = score(doc=2030,freq=3.0), product of:
              0.3346319 = queryWeight, product of:
                2.3019536 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.014676101 = queryNorm
              1.0722615 = fieldWeight in 2030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.0625 = fieldNorm(doc=2030)
          0.17329639 = weight(abstract_txt:automatischen in 2030) [ClassicSimilarity], result of:
            0.17329639 = score(doc=2030,freq=1.0), product of:
              0.4032138 = queryWeight, product of:
                3.995311 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              0.42978784 = fieldWeight in 2030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=2030)
        0.2 = coord(5/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15036288 = sum of:
      0.15036288 = product of:
        0.75181437 = sum of:
          0.09137381 = weight(abstract_txt:titeldatensätze in 4854) [ClassicSimilarity], result of:
            0.09137381 = score(doc=4854,freq=1.0), product of:
              0.1538982 = queryWeight, product of:
                1.1038617 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.014676101 = queryNorm
              0.5937289 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.013783172 = weight(abstract_txt:werden in 4854) [ClassicSimilarity], result of:
            0.013783172 = score(doc=4854,freq=1.0), product of:
              0.06289637 = queryWeight, product of:
                1.2222817 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014676101 = queryNorm
              0.21914098 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.2929695 = weight(abstract_txt:colibri in 4854) [ClassicSimilarity], result of:
            0.2929695 = score(doc=4854,freq=2.0), product of:
              0.3346319 = queryWeight, product of:
                2.3019536 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.014676101 = queryNorm
              0.8754978 = fieldWeight in 4854, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.18039148 = weight(abstract_txt:klassifizierung in 4854) [ClassicSimilarity], result of:
            0.18039148 = score(doc=4854,freq=1.0), product of:
              0.34930393 = queryWeight, product of:
                2.8804495 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.014676101 = queryNorm
              0.5164313 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.17329639 = weight(abstract_txt:automatischen in 4854) [ClassicSimilarity], result of:
            0.17329639 = score(doc=4854,freq=1.0), product of:
              0.4032138 = queryWeight, product of:
                3.995311 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              0.42978784 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
        0.2 = coord(5/25)
    
  5. Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.13
    0.1306144 = sum of:
      0.1306144 = product of:
        0.653072 = sum of:
          0.034457933 = weight(abstract_txt:werden in 1410) [ClassicSimilarity], result of:
            0.034457933 = score(doc=1410,freq=4.0), product of:
              0.06289637 = queryWeight, product of:
                1.2222817 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014676101 = queryNorm
              0.54785246 = fieldWeight in 1410, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.078125 = fieldNorm(doc=1410)
          0.08829091 = weight(abstract_txt:verfahren in 1410) [ClassicSimilarity], result of:
            0.08829091 = score(doc=1410,freq=3.0), product of:
              0.11323811 = queryWeight, product of:
                1.3390883 = boost
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.014676101 = queryNorm
              0.77969253 = fieldWeight in 1410, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.078125 = fieldNorm(doc=1410)
          0.08821336 = weight(abstract_txt:automatische in 1410) [ClassicSimilarity], result of:
            0.08821336 = score(doc=1410,freq=1.0), product of:
              0.16322197 = queryWeight, product of:
                1.6076897 = boost
                6.9177637 = idf(docFreq=118, maxDocs=44218)
                0.014676101 = queryNorm
              0.5404503 = fieldWeight in 1410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9177637 = idf(docFreq=118, maxDocs=44218)
                0.078125 = fieldNorm(doc=1410)
          0.22548935 = weight(abstract_txt:klassifizierung in 1410) [ClassicSimilarity], result of:
            0.22548935 = score(doc=1410,freq=1.0), product of:
              0.34930393 = queryWeight, product of:
                2.8804495 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.014676101 = queryNorm
              0.6455391 = fieldWeight in 1410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.078125 = fieldNorm(doc=1410)
          0.21662048 = weight(abstract_txt:automatischen in 1410) [ClassicSimilarity], result of:
            0.21662048 = score(doc=1410,freq=1.0), product of:
              0.4032138 = queryWeight, product of:
                3.995311 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014676101 = queryNorm
              0.5372348 = fieldWeight in 1410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.078125 = fieldNorm(doc=1410)
        0.2 = coord(5/25)
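
The content-match scores combine a per-term product with a coordination factor. The sketch below recomputes the score of entry 2 (doc=122) from the values printed in its explanation tree. It is an illustration under stated assumptions, not the engine's implementation: the idf form 1 + ln(maxDocs / (docFreq + 1)) is inferred from the listing, and the names `TermMatch`, `term_score`, and `document_score` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TermMatch:
    boost: float   # per-term query boost from the listing
    doc_freq: int  # number of documents containing the term
    freq: float    # occurrences of the term in the matched field


def idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf form; it reproduces the idf values in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(m: TermMatch, max_docs: int, query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    term_idf = idf(m.doc_freq, max_docs)
    query_weight = m.boost * term_idf * query_norm
    field_weight = math.sqrt(m.freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(matches, total_query_terms, max_docs, query_norm, field_norm):
    """Sum of term scores, scaled by coord = matched terms / query terms."""
    coord = len(matches) / total_query_terms
    return coord * sum(term_score(m, max_docs, query_norm, field_norm) for m in matches)


# Terms matched in doc=122, copied from the listing above.
matches_122 = [
    TermMatch(boost=1.1038617, doc_freq=8, freq=1.0),    # titeldatensätze
    TermMatch(boost=1.7898685, doc_freq=707, freq=1.0),  # deutschen
    TermMatch(boost=2.2661293, doc_freq=6, freq=1.0),    # titeldatensätzen
    TermMatch(boost=2.3971868, doc_freq=123, freq=1.0),  # bewertung
    TermMatch(boost=3.995311, doc_freq=123, freq=2.0),   # automatischen
]

score_122 = document_score(
    matches_122,
    total_query_terms=25,
    max_docs=44218,
    query_norm=0.014676101,
    field_norm=0.109375,
)
print(f"{score_122:.4f}")
```

The coordination factor explains why entry 1 ranks well above the others even though its per-term sum is similar: it matches 10 of the 25 query terms (coord 0.4), while entries 2 to 5 match only 5 (coord 0.2).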