Document (#38276)

Author
Wiesenmüller, H.
Pfeffer, M.
Title
Abgleichen, anreichern, verknüpfen : das Clustering-Verfahren - eine neue Möglichkeit für die Analyse und Verbesserung von Katalogdaten
Source
BuB. 65(2013) H.9, S. 625-629
Year
2013
Series
Lesesaal: Praxis
Abstract
A comparatively simple procedure forms the basis: by matching just a few fields, the bibliographic records in a data pool (which may also consist of several catalogs) that belong to the same work can be brought together with high reliability. Such a work cluster then comprises different editions and printings of a work as well as translations. A cluster contains all records that agree in the uniform title, or in the title proper and other title information, and that share at least one linked person or corporate body.
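The matching rule in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record layout (`id`, `title`, `subtitle`, `uniform_title`, `agents`), the normalization, and the sample records are all hypothetical. It also assumes that a record's uniform title, when present, takes the place of its own title for matching, and that clusters are closed transitively.

```python
from collections import defaultdict

def normalize(text):
    # Hypothetical normalization: lowercase and collapse whitespace.
    return " ".join(text.lower().split())

def title_key(record):
    # Assumption: prefer the uniform title; otherwise use title plus subtitle.
    if record.get("uniform_title"):
        return normalize(record["uniform_title"])
    return normalize(record["title"] + " " + record.get("subtitle", ""))

def cluster_records(records):
    # Union-find over record ids; records that share a (title key, agent)
    # pair end up in the same cluster.
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}
    for r in records:
        for agent in r["agents"]:
            key = (title_key(r), agent)
            if key in first_seen:
                parent[find(r["id"])] = find(first_seen[key])
            else:
                first_seen[key] = r["id"]

    clusters = defaultdict(set)
    for r in records:
        clusters[find(r["id"])].add(r["id"])
    return sorted(sorted(c) for c in clusters.values())

# Illustrative records only; the ids and agent identifiers are made up.
records = [
    {"id": "a1", "title": "Der Process", "agents": {"kafka"}},
    {"id": "a2", "title": "The trial", "uniform_title": "Der Process",
     "agents": {"kafka", "translator-x"}},
    {"id": "a3", "title": "Der Process", "agents": {"other-author"}},
]
print(cluster_records(records))  # [['a1', 'a2'], ['a3']]
```

The translation `a2` joins the original `a1` through its uniform title and the shared person, while `a3` stays separate because it has the same title but no agent in common.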
Footnote
Alongside the customary lecture sessions in large halls, the Leipzig Library Congress (Leipziger Bibliothekskongress) in March 2013 offered a new event format: various workshops provided the opportunity to examine topics in depth and to discuss them in small groups. One of these workshops was organized by the authors of this article and was devoted to novel ways of analyzing and improving catalog data. Markus Geipel of the German National Library (Deutsche Nationalbibliothek, DNB) joined virtually as the third speaker via Google Hangout. The event was initiated by the AG Bibliotheken (libraries working group) of the Deutsche Gesellschaft für Klassifikation, which thereby followed up on its 2012 conference in Hildesheim. The most important results are summarized below.
Theme
Kataloganreicherung

Similar documents (author)

  1. Pfeffer, M.; Wiesenmüller, H.: Resource Discovery Systeme (2016) 6.03
    6.026801 = sum of:
      6.026801 = sum of:
        2.3754253 = weight(author_txt:wiesenmüller in 6775) [ClassicSimilarity], result of:
          2.3754253 = score(doc=6775,freq=1.0), product of:
            0.60040843 = queryWeight, product of:
              7.912698 = idf(docFreq=43, maxDocs=44218)
              0.075879104 = queryNorm
            3.956349 = fieldWeight in 6775, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.912698 = idf(docFreq=43, maxDocs=44218)
              0.5 = fieldNorm(doc=6775)
        3.6513755 = weight(author_txt:pfeffer in 6775) [ClassicSimilarity], result of:
          3.6513755 = score(doc=6775,freq=1.0), product of:
            0.79969347 = queryWeight, product of:
              1.1540866 = boost
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.075879104 = queryNorm
            4.565969 = fieldWeight in 6775, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.5 = fieldNorm(doc=6775)
    
  2. Wiesenmüller, H.; Maylein, L.; Pfeffer, M.: Mehr aus der Schlagwortnormdatei herausholen : Implementierung einer geographischen Facette in den Online-Katalogen der UB Heidelberg und der UB Mannheim (2011) 4.52
    4.5201006 = sum of:
      4.5201006 = sum of:
        1.781569 = weight(author_txt:wiesenmüller in 2563) [ClassicSimilarity], result of:
          1.781569 = score(doc=2563,freq=1.0), product of:
            0.60040843 = queryWeight, product of:
              7.912698 = idf(docFreq=43, maxDocs=44218)
              0.075879104 = queryNorm
            2.9672618 = fieldWeight in 2563, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.912698 = idf(docFreq=43, maxDocs=44218)
              0.375 = fieldNorm(doc=2563)
        2.7385316 = weight(author_txt:pfeffer in 2563) [ClassicSimilarity], result of:
          2.7385316 = score(doc=2563,freq=1.0), product of:
            0.79969347 = queryWeight, product of:
              1.1540866 = boost
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.075879104 = queryNorm
            3.4244766 = fieldWeight in 2563, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.375 = fieldNorm(doc=2563)
    
  3. Pfeffer, J.: Online-Tutorials an deutschen Universitäts- und Hochschulbibliotheken : Verbreitung, Typologie und Analyse am Beispiel von LOTSE, DISCUS und BibTutor (2005) 2.28
    2.2821097 = sum of:
      2.2821097 = product of:
        4.5642195 = sum of:
          4.5642195 = weight(author_txt:pfeffer in 4837) [ClassicSimilarity], result of:
            4.5642195 = score(doc=4837,freq=1.0), product of:
              0.79969347 = queryWeight, product of:
                1.1540866 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.075879104 = queryNorm
              5.7074614 = fieldWeight in 4837, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.625 = fieldNorm(doc=4837)
        0.5 = coord(1/2)
    
  4. Pfeffer, M.: Automatische Vergabe von RVK-Notationen anhand von bibliografischen Daten mittels fallbasiertem Schließen (2007) 2.28
    2.2821097 = sum of:
      2.2821097 = product of:
        4.5642195 = sum of:
          4.5642195 = weight(author_txt:pfeffer in 558) [ClassicSimilarity], result of:
            4.5642195 = score(doc=558,freq=1.0), product of:
              0.79969347 = queryWeight, product of:
                1.1540866 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.075879104 = queryNorm
              5.7074614 = fieldWeight in 558, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.625 = fieldNorm(doc=558)
        0.5 = coord(1/2)
    
  5. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 2.28
    2.2821097 = sum of:
      2.2821097 = product of:
        4.5642195 = sum of:
          4.5642195 = weight(author_txt:pfeffer in 3051) [ClassicSimilarity], result of:
            4.5642195 = score(doc=3051,freq=1.0), product of:
              0.79969347 = queryWeight, product of:
                1.1540866 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.075879104 = queryNorm
              5.7074614 = fieldWeight in 3051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.625 = fieldNorm(doc=3051)
        0.5 = coord(1/2)
    

Similar documents (content)

  1. Schaffner, V.: FRBR in MAB2 und Primo - ein kafkaesker Prozess? : Möglichkeiten der FRBRisierung von MAB2-Datensätzen in Primo exemplarisch dargestellt an Datensätzen zu Franz Kafkas "Der Process" (2011) 0.05
    0.052928813 = sum of:
      0.052928813 = product of:
        0.44107345 = sum of:
          0.07736586 = weight(abstract_txt:übersetzungen in 907) [ClassicSimilarity], result of:
            0.07736586 = score(doc=907,freq=1.0), product of:
              0.17120986 = queryWeight, product of:
                1.0472325 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.019785777 = queryNorm
              0.45187736 = fieldWeight in 907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0546875 = fieldNorm(doc=907)
          0.09194761 = weight(abstract_txt:werkes in 907) [ClassicSimilarity], result of:
            0.09194761 = score(doc=907,freq=1.0), product of:
              0.19209799 = queryWeight, product of:
                1.1092774 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.019785777 = queryNorm
              0.4786495 = fieldWeight in 907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0546875 = fieldNorm(doc=907)
          0.27176 = weight(abstract_txt:datensätze in 907) [ClassicSimilarity], result of:
            0.27176 = score(doc=907,freq=4.0), product of:
              0.31400955 = queryWeight, product of:
                2.0056963 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.019785777 = queryNorm
              0.86545134 = fieldWeight in 907, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.0546875 = fieldNorm(doc=907)
        0.12 = coord(3/25)
    
  2. Unser, M.; Wäckerlin, D.: Dienstleistung "Abstract/Index" im NEBIS-Katalog (2006) 0.05
    0.049099147 = sum of:
      0.049099147 = product of:
        0.3068697 = sum of:
          0.01916669 = weight(abstract_txt:einem in 5030) [ClassicSimilarity], result of:
            0.01916669 = score(doc=5030,freq=1.0), product of:
              0.09429785 = queryWeight, product of:
                1.0991188 = boost
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.019785777 = queryNorm
              0.2032569 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.09451759 = weight(abstract_txt:datenpool in 5030) [ClassicSimilarity], result of:
            0.09451759 = score(doc=5030,freq=1.0), product of:
              0.21683803 = queryWeight, product of:
                1.1785458 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.019785777 = queryNorm
              0.43589026 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.07671682 = weight(abstract_txt:gehören in 5030) [ClassicSimilarity], result of:
            0.07671682 = score(doc=5030,freq=1.0), product of:
              0.23771912 = queryWeight, product of:
                1.7451221 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.019785777 = queryNorm
              0.32272044 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.11646856 = weight(abstract_txt:datensätze in 5030) [ClassicSimilarity], result of:
            0.11646856 = score(doc=5030,freq=1.0), product of:
              0.31400955 = queryWeight, product of:
                2.0056963 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.019785777 = queryNorm
              0.37090772 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
        0.16 = coord(4/25)
    
  3. Sen, W.: Social Media Measurement : Methoden zur automatischen Reichweitenmessung von Beiträgen in Webforen (2013) 0.05
    0.048471518 = sum of:
      0.048471518 = product of:
        0.30294698 = sum of:
          0.07698603 = weight(abstract_txt:solches in 993) [ClassicSimilarity], result of:
            0.07698603 = score(doc=993,freq=1.0), product of:
              0.15611424 = queryWeight, product of:
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.019785777 = queryNorm
              0.49313906 = fieldWeight in 993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0625 = fieldNorm(doc=993)
          0.08642623 = weight(abstract_txt:diejenigen in 993) [ClassicSimilarity], result of:
            0.08642623 = score(doc=993,freq=1.0), product of:
              0.16862874 = queryWeight, product of:
                1.0393087 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.019785777 = queryNorm
              0.5125237 = fieldWeight in 993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=993)
          0.036141057 = weight(abstract_txt:einem in 993) [ClassicSimilarity], result of:
            0.036141057 = score(doc=993,freq=2.0), product of:
              0.09429785 = queryWeight, product of:
                1.0991188 = boost
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.019785777 = queryNorm
              0.3832649 = fieldWeight in 993, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.0625 = fieldNorm(doc=993)
          0.10339367 = weight(abstract_txt:werk in 993) [ClassicSimilarity], result of:
            0.10339367 = score(doc=993,freq=1.0), product of:
              0.2394274 = queryWeight, product of:
                1.7513812 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.019785777 = queryNorm
              0.43183723 = fieldWeight in 993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0625 = fieldNorm(doc=993)
        0.16 = coord(4/25)
    
  4. ¬Der Digitale Peters : Arno Peters' synchronoptische Weltgeschichte (2010) 0.05
    0.046599768 = sum of:
      0.046599768 = product of:
        0.3883314 = sum of:
          0.03833338 = weight(abstract_txt:einem in 4783) [ClassicSimilarity], result of:
            0.03833338 = score(doc=4783,freq=1.0), product of:
              0.09429785 = queryWeight, product of:
                1.0991188 = boost
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.019785777 = queryNorm
              0.4065138 = fieldWeight in 4783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.09375 = fieldNorm(doc=4783)
          0.19490752 = weight(abstract_txt:auflagen in 4783) [ClassicSimilarity], result of:
            0.19490752 = score(doc=4783,freq=1.0), product of:
              0.22130579 = queryWeight, product of:
                1.1906254 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.019785777 = queryNorm
              0.88071585 = fieldWeight in 4783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.09375 = fieldNorm(doc=4783)
          0.15509051 = weight(abstract_txt:werk in 4783) [ClassicSimilarity], result of:
            0.15509051 = score(doc=4783,freq=1.0), product of:
              0.2394274 = queryWeight, product of:
                1.7513812 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.019785777 = queryNorm
              0.64775586 = fieldWeight in 4783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.09375 = fieldNorm(doc=4783)
        0.12 = coord(3/25)
    
  5. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.05
    0.04629202 = sum of:
      0.04629202 = product of:
        0.28932512 = sum of:
          0.068046674 = weight(abstract_txt:bibliografischen in 1952) [ClassicSimilarity], result of:
            0.068046674 = score(doc=1952,freq=2.0), product of:
              0.15611424 = queryWeight, product of:
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.019785777 = queryNorm
              0.43587744 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.068046674 = weight(abstract_txt:ausgaben in 1952) [ClassicSimilarity], result of:
            0.068046674 = score(doc=1952,freq=2.0), product of:
              0.15611424 = queryWeight, product of:
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.019785777 = queryNorm
              0.43587744 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.01597224 = weight(abstract_txt:einem in 1952) [ClassicSimilarity], result of:
            0.01597224 = score(doc=1952,freq=1.0), product of:
              0.09429785 = queryWeight, product of:
                1.0991188 = boost
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.019785777 = queryNorm
              0.16938075 = fieldWeight in 1952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3361473 = idf(docFreq=1572, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.13725953 = weight(abstract_txt:datensätze in 1952) [ClassicSimilarity], result of:
            0.13725953 = score(doc=1952,freq=2.0), product of:
              0.31400955 = queryWeight, product of:
                2.0056963 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.019785777 = queryNorm
              0.43711895 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
        0.16 = coord(4/25)
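
The score breakdowns above follow the ClassicSimilarity (TF-IDF) arithmetic: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the summed term scores are scaled by coord. The sketch below recomputes the first content match (doc 907) from the numbers in the listing. The function names are hypothetical helpers chosen for illustration, and small rounding differences are to be expected because the listing shows single-precision values.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as shown in the explain output.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # Score of a single term: queryWeight * fieldWeight.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, total_terms):
    # Sum of term scores, scaled by coord = matched / total query terms.
    return sum(term_scores) * matched / total_terms

MAX_DOCS = 44218
QUERY_NORM = 0.019785777
FIELD_NORM = 0.0546875  # fieldNorm(doc=907)

# (freq, docFreq, boost) for übersetzungen, werkes, datensätze in doc 907
terms = [
    (1.0, 30, 1.0472325),
    (1.0, 18, 1.1092774),
    (4.0, 43, 2.0056963),
]
scores = [
    term_score(freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM, boost)
    for freq, df, boost in terms
]
total = document_score(scores, matched=3, total_terms=25)
print(round(total, 4))  # 0.0529
```

The result agrees with the 0.052928813 reported for entry 1 of the content list, up to rounding.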