Document (#38277)

Author
Wiesenmüller, H.
Pfeffer, M.
Title
Abgleichen, anreichern, verknüpfen : das Clustering-Verfahren - eine neue Möglichkeit für die Analyse und Verbesserung von Katalogdaten
Source
BuB. 65(2013) H.9, S. 625-629
Year
2013
Series
Lesesaal: Praxis
Abstract
The foundation is a comparatively simple procedure: by matching just a few fields, the bibliographic records in a data pool (which may itself consist of several catalogs) that belong to the same work can be brought together with high reliability. Such a work cluster then comprises different editions and printings of a work as well as translations. A cluster contains all records that agree in the uniform title, or in the title proper and other title information, and that share at least one linked person or corporate body.
Footnote
In addition to the customary lecture sessions in large halls, the Leipziger Bibliothekskongress (Leipzig Library Congress) in March 2013 introduced a new event format: various workshops offered the opportunity to examine topics in depth and to discuss them in small groups. One of these workshops was organized by the authors of this article and was devoted to novel possibilities for analyzing and improving catalog data. Markus Geipel of the Deutsche Nationalbibliothek (DNB) joined virtually via Google Hangout as the third speaker. The event was initiated by the AG Bibliotheken of the Deutsche Gesellschaft für Klassifikation, which thereby followed up on its 2012 conference in Hildesheim. The most important results are summarized below.
Theme
Kataloganreicherung
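
Example (illustrative sketch, not the authors' implementation)
The matching rule stated in the abstract can be sketched roughly as follows. The record layout (the keys "uniform_title", "title", "subtitle", "agents"), the normalization step, and the choice to merge clusters transitively with a union-find structure are all assumptions made for this illustration; the record does not say how the authors handle these details. The sample records are invented.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace (an assumed, very simple normalization)."""
    return " ".join(text.lower().split())


def title_key(record):
    """Use the uniform title if present, otherwise title plus subtitle."""
    if record.get("uniform_title"):
        return normalize(record["uniform_title"])
    return normalize(record["title"] + " " + record.get("subtitle", ""))


def cluster_records(records):
    """Group records that share a title key and at least one linked agent."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Remember the first record seen for each (title key, agent) pair;
    # any later record with the same pair joins that record's cluster.
    first_seen = {}
    for index, record in enumerate(records):
        key = title_key(record)
        for agent in record["agents"]:
            pair = (key, agent)
            if pair in first_seen:
                union(index, first_seen[pair])
            else:
                first_seen[pair] = index

    clusters = defaultdict(list)
    for index, record in enumerate(records):
        clusters[find(index)].append(record["id"])
    return sorted(sorted(ids) for ids in clusters.values())


# Invented sample data: an original edition, a translation carrying a
# uniform title, and an unrelated record that happens to share the title.
records = [
    {"id": "a1", "title": "Der Process", "agents": {"Kafka, Franz"}},
    {"id": "a2", "uniform_title": "Der Process", "title": "The trial",
     "agents": {"Kafka, Franz", "Translator, A."}},
    {"id": "b1", "title": "Der Process", "agents": {"Other, Author"}},
]

work_clusters = cluster_records(records)
print(work_clusters)
```

In this sketch the translation joins the original because its uniform title matches the original's title and both link to the same person, while the third record stays separate because it shares no agent.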

Similar documents (author)

  1. Pfeffer, M.; Wiesenmüller, H.: Resource Discovery Systeme (2016) 6.04
    6.0379844 = sum of:
      6.0379844 = sum of:
        2.4426777 = weight(author_txt:wiesenmüller in 6844) [ClassicSimilarity], result of:
          2.4426777 = score(doc=6844,freq=1.0), product of:
            0.6115009 = queryWeight, product of:
              7.9891224 = idf(docFreq=38, maxDocs=42306)
              0.07654169 = queryNorm
            3.9945612 = fieldWeight in 6844, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.9891224 = idf(docFreq=38, maxDocs=42306)
              0.5 = fieldNorm(doc=6844)
        3.5953066 = weight(author_txt:pfeffer in 6844) [ClassicSimilarity], result of:
          3.5953066 = score(doc=6844,freq=1.0), product of:
            0.79124373 = queryWeight, product of:
              1.1375135 = boost
              9.087735 = idf(docFreq=12, maxDocs=42306)
              0.07654169 = queryNorm
            4.5438676 = fieldWeight in 6844, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.087735 = idf(docFreq=12, maxDocs=42306)
              0.5 = fieldNorm(doc=6844)
    
  2. Wiesenmüller, H.; Maylein, L.; Pfeffer, M.: Mehr aus der Schlagwortnormdatei herausholen : Implementierung einer geographischen Facette in den Online-Katalogen der UB Heidelberg und der UB Mannheim (2011) 4.53
    4.5284886 = sum of:
      4.5284886 = sum of:
        1.8320084 = weight(author_txt:wiesenmüller in 383) [ClassicSimilarity], result of:
          1.8320084 = score(doc=383,freq=1.0), product of:
            0.6115009 = queryWeight, product of:
              7.9891224 = idf(docFreq=38, maxDocs=42306)
              0.07654169 = queryNorm
            2.995921 = fieldWeight in 383, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.9891224 = idf(docFreq=38, maxDocs=42306)
              0.375 = fieldNorm(doc=383)
        2.6964803 = weight(author_txt:pfeffer in 383) [ClassicSimilarity], result of:
          2.6964803 = score(doc=383,freq=1.0), product of:
            0.79124373 = queryWeight, product of:
              1.1375135 = boost
              9.087735 = idf(docFreq=12, maxDocs=42306)
              0.07654169 = queryNorm
            3.4079008 = fieldWeight in 383, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.087735 = idf(docFreq=12, maxDocs=42306)
              0.375 = fieldNorm(doc=383)
    
  3. Pfeffer, J.: Online-Tutorials an deutschen Universitäts- und Hochschulbibliotheken : Verbreitung, Typologie und Analyse am Beispiel von LOTSE, DISCUS und BibTutor (2005) 2.25
    2.2470667 = sum of:
      2.2470667 = product of:
        4.4941335 = sum of:
          4.4941335 = weight(author_txt:pfeffer in 838) [ClassicSimilarity], result of:
            4.4941335 = score(doc=838,freq=1.0), product of:
              0.79124373 = queryWeight, product of:
                1.1375135 = boost
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.07654169 = queryNorm
              5.6798344 = fieldWeight in 838, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.625 = fieldNorm(doc=838)
        0.5 = coord(1/2)
    
  4. Pfeffer, M.: Automatische Vergabe von RVK-Notationen anhand von bibliografischen Daten mittels fallbasiertem Schließen (2007) 2.25
    2.2470667 = sum of:
      2.2470667 = product of:
        4.4941335 = sum of:
          4.4941335 = weight(author_txt:pfeffer in 2559) [ClassicSimilarity], result of:
            4.4941335 = score(doc=2559,freq=1.0), product of:
              0.79124373 = queryWeight, product of:
                1.1375135 = boost
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.07654169 = queryNorm
              5.6798344 = fieldWeight in 2559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.625 = fieldNorm(doc=2559)
        0.5 = coord(1/2)
    
  5. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 2.25
    2.2470667 = sum of:
      2.2470667 = product of:
        4.4941335 = sum of:
          4.4941335 = weight(author_txt:pfeffer in 52) [ClassicSimilarity], result of:
            4.4941335 = score(doc=52,freq=1.0), product of:
              0.79124373 = queryWeight, product of:
                1.1375135 = boost
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.07654169 = queryNorm
              5.6798344 = fieldWeight in 52, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.087735 = idf(docFreq=12, maxDocs=42306)
                0.625 = fieldNorm(doc=52)
        0.5 = coord(1/2)
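
Example (illustrative sketch)
The score trees above follow the pattern of Lucene's ClassicSimilarity (TF-IDF): each term weight is queryWeight times fieldWeight, with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following sketch recomputes the two term weights of the first hit from the values printed in its tree. The function names are made up for this illustration; queryNorm, fieldNorm, and boost are taken from the output as given rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Recompute one 'weight(...)' node: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


QUERY_NORM = 0.07654169  # copied from the explain output
MAX_DOCS = 42306

# First hit (doc=6844): both author terms occur once, fieldNorm is 0.5.
wiesenmueller = term_score(1.0, 38, MAX_DOCS, QUERY_NORM, 0.5)
pfeffer = term_score(1.0, 12, MAX_DOCS, QUERY_NORM, 0.5, boost=1.1375135)
total = wiesenmueller + pfeffer

print(round(wiesenmueller, 4), round(pfeffer, 4), round(total, 4))
```

Small differences in the last digits relative to the printed values are expected, since the index works with single-precision floats.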
    

Similar documents (content)

  1. Schaffner, V.: FRBR in MAB2 und Primo - ein kafkaesker Prozess? : Möglichkeiten der FRBRisierung von MAB2-Datensätzen in Primo exemplarisch dargestellt an Datensätzen zu Franz Kafkas "Der Process" (2011) 0.05
    0.052933607 = sum of:
      0.052933607 = product of:
        0.4411134 = sum of:
          0.07658012 = weight(abstract_txt:übersetzungen in 2908) [ClassicSimilarity], result of:
            0.07658012 = score(doc=2908,freq=1.0), product of:
              0.16970545 = queryWeight, product of:
                1.0516778 = boost
                8.251487 = idf(docFreq=29, maxDocs=42306)
                0.019556036 = queryNorm
              0.45125318 = fieldWeight in 2908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.251487 = idf(docFreq=29, maxDocs=42306)
                0.0546875 = fieldNorm(doc=2908)
          0.09170134 = weight(abstract_txt:werkes in 2908) [ClassicSimilarity], result of:
            0.09170134 = score(doc=2908,freq=1.0), product of:
              0.1913678 = queryWeight, product of:
                1.1167842 = boost
                8.762313 = idf(docFreq=17, maxDocs=42306)
                0.019556036 = queryNorm
              0.47918898 = fieldWeight in 2908, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.762313 = idf(docFreq=17, maxDocs=42306)
                0.0546875 = fieldNorm(doc=2908)
          0.27283195 = weight(abstract_txt:datensätze in 2908) [ClassicSimilarity], result of:
            0.27283195 = score(doc=2908,freq=4.0), product of:
              0.3141993 = queryWeight, product of:
                2.0237293 = boost
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.019556036 = queryNorm
              0.8683404 = fieldWeight in 2908, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.0546875 = fieldNorm(doc=2908)
        0.12 = coord(3/25)
    
  2. Unser, M.; Wäckerlin, D.: Dienstleistung "Abstract/Index" im NEBIS-Katalog (2006) 0.05
    0.04876016 = sum of:
      0.04876016 = product of:
        0.304751 = sum of:
          0.019326363 = weight(abstract_txt:einem in 31) [ClassicSimilarity], result of:
            0.019326363 = score(doc=31,freq=1.0), product of:
              0.09462945 = queryWeight, product of:
                1.1106136 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.019556036 = queryNorm
              0.204232 = fieldWeight in 31, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.046875 = fieldNorm(doc=31)
          0.09261306 = weight(abstract_txt:datenpool in 31) [ClassicSimilarity], result of:
            0.09261306 = score(doc=31,freq=1.0), product of:
              0.21348354 = queryWeight, product of:
                1.1795518 = boost
                9.254789 = idf(docFreq=10, maxDocs=42306)
                0.019556036 = queryNorm
              0.43381825 = fieldWeight in 31, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.254789 = idf(docFreq=10, maxDocs=42306)
                0.046875 = fieldNorm(doc=31)
          0.075883605 = weight(abstract_txt:gehören in 31) [ClassicSimilarity], result of:
            0.075883605 = score(doc=31,freq=1.0), product of:
              0.23551844 = queryWeight, product of:
                1.7521137 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.019556036 = queryNorm
              0.32219815 = fieldWeight in 31, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.046875 = fieldNorm(doc=31)
          0.11692798 = weight(abstract_txt:datensätze in 31) [ClassicSimilarity], result of:
            0.11692798 = score(doc=31,freq=1.0), product of:
              0.3141993 = queryWeight, product of:
                2.0237293 = boost
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.019556036 = queryNorm
              0.3721459 = fieldWeight in 31, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.046875 = fieldNorm(doc=31)
        0.16 = coord(4/25)
    
  3. Sen, W.: Social Media Measurement : Methoden zur automatischen Reichweitenmessung von Beiträgen in Webforen (2013) 0.05
    0.04833532 = sum of:
      0.04833532 = product of:
        0.30209577 = sum of:
          0.07724431 = weight(abstract_txt:solches in 2994) [ClassicSimilarity], result of:
            0.07724431 = score(doc=2994,freq=1.0), product of:
              0.1561474 = queryWeight, product of:
                1.0087934 = boost
                7.9150147 = idf(docFreq=41, maxDocs=42306)
                0.019556036 = queryNorm
              0.49468842 = fieldWeight in 2994, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9150147 = idf(docFreq=41, maxDocs=42306)
                0.0625 = fieldNorm(doc=2994)
          0.08648089 = weight(abstract_txt:diejenigen in 2994) [ClassicSimilarity], result of:
            0.08648089 = score(doc=2994,freq=1.0), product of:
              0.16835934 = queryWeight, product of:
                1.0474986 = boost
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.019556036 = queryNorm
              0.51366854 = fieldWeight in 2994, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.218697 = idf(docFreq=30, maxDocs=42306)
                0.0625 = fieldNorm(doc=2994)
          0.03644214 = weight(abstract_txt:einem in 2994) [ClassicSimilarity], result of:
            0.03644214 = score(doc=2994,freq=2.0), product of:
              0.09462945 = queryWeight, product of:
                1.1106136 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.019556036 = queryNorm
              0.38510355 = fieldWeight in 2994, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.0625 = fieldNorm(doc=2994)
          0.10192846 = weight(abstract_txt:werk in 2994) [ClassicSimilarity], result of:
            0.10192846 = score(doc=2994,freq=1.0), product of:
              0.23668137 = queryWeight, product of:
                1.7564341 = boost
                6.89051 = idf(docFreq=116, maxDocs=42306)
                0.019556036 = queryNorm
              0.43065688 = fieldWeight in 2994, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.89051 = idf(docFreq=116, maxDocs=42306)
                0.0625 = fieldNorm(doc=2994)
        0.16 = coord(4/25)
    
  4. ¬Der Digitale Peters : Arno Peters' synchronoptische Weltgeschichte (2010) 0.05
    0.047586914 = sum of:
      0.047586914 = product of:
        0.39655763 = sum of:
          0.038652726 = weight(abstract_txt:einem in 1784) [ClassicSimilarity], result of:
            0.038652726 = score(doc=1784,freq=1.0), product of:
              0.09462945 = queryWeight, product of:
                1.1106136 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.019556036 = queryNorm
              0.408464 = fieldWeight in 1784, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.09375 = fieldNorm(doc=1784)
          0.20501222 = weight(abstract_txt:auflagen in 1784) [ClassicSimilarity], result of:
            0.20501222 = score(doc=1784,freq=1.0), product of:
              0.22842804 = queryWeight, product of:
                1.2201396 = boost
                9.573242 = idf(docFreq=7, maxDocs=42306)
                0.019556036 = queryNorm
              0.89749146 = fieldWeight in 1784, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.573242 = idf(docFreq=7, maxDocs=42306)
                0.09375 = fieldNorm(doc=1784)
          0.1528927 = weight(abstract_txt:werk in 1784) [ClassicSimilarity], result of:
            0.1528927 = score(doc=1784,freq=1.0), product of:
              0.23668137 = queryWeight, product of:
                1.7564341 = boost
                6.89051 = idf(docFreq=116, maxDocs=42306)
                0.019556036 = queryNorm
              0.6459853 = fieldWeight in 1784, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.89051 = idf(docFreq=116, maxDocs=42306)
                0.09375 = fieldNorm(doc=1784)
        0.12 = coord(3/25)
    
  5. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.05
    0.04628989 = sum of:
      0.04628989 = product of:
        0.28931183 = sum of:
          0.06650509 = weight(abstract_txt:bibliografischen in 2953) [ClassicSimilarity], result of:
            0.06650509 = score(doc=2953,freq=2.0), product of:
              0.15343708 = queryWeight, product of:
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.019556036 = queryNorm
              0.43343556 = fieldWeight in 2953, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.0390625 = fieldNorm(doc=2953)
          0.06890047 = weight(abstract_txt:ausgaben in 2953) [ClassicSimilarity], result of:
            0.06890047 = score(doc=2953,freq=2.0), product of:
              0.15709965 = queryWeight, product of:
                1.0118647 = boost
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.019556036 = queryNorm
              0.43857813 = fieldWeight in 2953, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.0390625 = fieldNorm(doc=2953)
          0.016105302 = weight(abstract_txt:einem in 2953) [ClassicSimilarity], result of:
            0.016105302 = score(doc=2953,freq=1.0), product of:
              0.09462945 = queryWeight, product of:
                1.1106136 = boost
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.019556036 = queryNorm
              0.17019333 = fieldWeight in 2953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3569493 = idf(docFreq=1473, maxDocs=42306)
                0.0390625 = fieldNorm(doc=2953)
          0.13780095 = weight(abstract_txt:datensätze in 2953) [ClassicSimilarity], result of:
            0.13780095 = score(doc=2953,freq=2.0), product of:
              0.3141993 = queryWeight, product of:
                2.0237293 = boost
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.019556036 = queryNorm
              0.43857813 = fieldWeight in 2953, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.939112 = idf(docFreq=40, maxDocs=42306)
                0.0390625 = fieldNorm(doc=2953)
        0.16 = coord(4/25)
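
Example (illustrative sketch)
For the content-based hits, the final score is the sum of the matching term weights multiplied by a coordination factor, coord(matched/total), where the total here is the 25 query terms. The sketch below recomputes two of the scores above from the term weights printed in their trees; the function name is made up for this illustration.

```python
def coord_score(term_weights, query_term_count):
    """Sum the matched term weights and scale by the share of query terms matched."""
    coord = len(term_weights) / query_term_count
    return sum(term_weights) * coord


# Hit 1 (doc=2908): 3 of 25 query terms matched.
schaffner = coord_score([0.07658012, 0.09170134, 0.27283195], 25)

# Hit 5 (doc=2953): 4 of 25 query terms matched.
ansorge = coord_score([0.06650509, 0.06890047, 0.016105302, 0.13780095], 25)

print(round(schaffner, 5), round(ansorge, 5))
```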