Document (#43369)

Author
Steeg, F.
Pohl, A.
Title
Ein Protokoll für den Datenabgleich im Web am Beispiel von OpenRefine und der Gemeinsamen Normdatei (GND)
Source
Qualität in der Inhaltserschließung. Ed.: M. Franke-Maier et al.
Imprint
München : De Gruyter Saur
Year
2021
Pages
pp. 259-278
Series
Bibliotheks- und Informationspraxis; 70
Abstract
Authority data (Normdaten) play an important role, especially with regard to the quality of subject indexing (Inhaltserschließung) of bibliographic and archival resources. One concrete goal of subject indexing is, for example, that all works about Hermann Hesse can be found in a uniform way. Authority data offer a solution here: for example, the GND number 11855042X is used consistently for Hermann Hesse during indexing. The result is a higher quality of subject indexing, above all in terms of uniformity and unambiguity, and, as a consequence, better findability. When such entities are linked to one another, e.g. Hermann Hesse with one of his works, a knowledge graph emerges, such as the one Google uses for indexing the content of the Web (Singhal 2012). The development of the Google Knowledge Graph and the protocol presented here are historically connected: OpenRefine was originally developed as Google Refine, and its functionality for matching against external data sources (reconciliation) was originally developed to integrate Freebase, one of the data sources of the Google Knowledge Graph. Freebase was later integrated into Wikidata. Google Refine itself was already used for reconciliation against authority data, such as the Library of Congress Subject Headings (Hooland et al. 2013).
Theme
Authority files
Semantic interoperability
Object
OpenRefine
GND
Google Knowledge Graph

Similar documents (author)

  1. Pohl, M.: Hypertext und analoge Wissensrepräsentation : Wie Texte zu Bildern und Bilder zu Texten werden (2003) 5.66
    5.661144 = sum of:
      5.661144 = weight(author_txt:pohl in 3855) [ClassicSimilarity], result of:
        5.661144 = fieldWeight in 3855, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.05783 = idf(docFreq=13, maxDocs=44218)
          0.625 = fieldNorm(doc=3855)
    
  2. Pohl, A.: OCLC, WorldCat und die Metadaten-Kontroverse (2009) 5.66
    5.661144 = sum of:
      5.661144 = weight(author_txt:pohl in 2780) [ClassicSimilarity], result of:
        5.661144 = fieldWeight in 2780, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.05783 = idf(docFreq=13, maxDocs=44218)
          0.625 = fieldNorm(doc=2780)
    
  3. Pohl, O.: Konzept und prototypische Erstellung eines Informationssystems auf VuFind-Basis für die Bibliotheks- und Informationswissenschaft (2012) 5.66
    5.661144 = sum of:
      5.661144 = weight(author_txt:pohl in 1564) [ClassicSimilarity], result of:
        5.661144 = fieldWeight in 1564, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.05783 = idf(docFreq=13, maxDocs=44218)
          0.625 = fieldNorm(doc=1564)
    
  4. Pohl, O.: rdfedit: user supporting Web application for creating and manipulating RDF instance data (2014) 5.66
    5.661144 = sum of:
      5.661144 = weight(author_txt:pohl in 1571) [ClassicSimilarity], result of:
        5.661144 = fieldWeight in 1571, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.05783 = idf(docFreq=13, maxDocs=44218)
          0.625 = fieldNorm(doc=1571)
    
  5. Pohl, A.: Mit der DFG und CIB nach WorldShare und Alma (2013) 5.66
    5.661144 = sum of:
      5.661144 = weight(author_txt:pohl in 1829) [ClassicSimilarity], result of:
        5.661144 = fieldWeight in 1829, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.05783 = idf(docFreq=13, maxDocs=44218)
          0.625 = fieldNorm(doc=1829)
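
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the listing follows the usual ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative only and are not part of any Lucene API; fieldNorm is copied from the listing rather than recomputed.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: author_txt:pohl, docFreq=13, maxDocs=44218
author_score = field_weight(freq=1.0, doc_freq=13, max_docs=44218, field_norm=0.625)
print(round(author_score, 2))  # 5.66, the score shown next to each author match
```

All five author matches share the same term, frequency, and fieldNorm, which is why they all receive the identical score of 5.66.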
    

Similar documents (content)

  1. Lischka, K.: 128 Zeichen für die Welt : Vor 40 Jahren schrieben Fachleute das Alphabet des Computers - und schufen damit den ASCII-Standard (2003) 0.10
    0.09841672 = sum of:
      0.09841672 = product of:
        0.30755228 = sum of:
          0.013883788 = weight(abstract_txt:hier in 391) [ClassicSimilarity], result of:
            0.013883788 = score(doc=391,freq=1.0), product of:
              0.08458059 = queryWeight, product of:
                1.1182855 = boost
                5.252756 = idf(docFreq=628, maxDocs=44218)
                0.014398949 = queryNorm
              0.16414863 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.252756 = idf(docFreq=628, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.018582873 = weight(abstract_txt:etwa in 391) [ClassicSimilarity], result of:
            0.018582873 = score(doc=391,freq=1.0), product of:
              0.102724485 = queryWeight, product of:
                1.2324075 = boost
                5.788804 = idf(docFreq=367, maxDocs=44218)
                0.014398949 = queryNorm
              0.18090013 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.788804 = idf(docFreq=367, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.028464984 = weight(abstract_txt:miteinander in 391) [ClassicSimilarity], result of:
            0.028464984 = score(doc=391,freq=1.0), product of:
              0.13650212 = queryWeight, product of:
                1.4206498 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.014398949 = queryNorm
              0.20853145 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.021533154 = weight(abstract_txt:eine in 391) [ClassicSimilarity], result of:
            0.021533154 = score(doc=391,freq=7.0), product of:
              0.07464164 = queryWeight, product of:
                1.4856719 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.014398949 = queryNorm
              0.28848717 = fieldWeight in 391, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.04418866 = weight(abstract_txt:ursprünglich in 391) [ClassicSimilarity], result of:
            0.04418866 = score(doc=391,freq=1.0), product of:
              0.18300907 = queryWeight, product of:
                1.6449535 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.014398949 = queryNorm
              0.2414561 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.068087675 = weight(abstract_txt:einheitlich in 391) [ClassicSimilarity], result of:
            0.068087675 = score(doc=391,freq=1.0), product of:
              0.24414308 = queryWeight, product of:
                1.8999385 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.014398949 = queryNorm
              0.27888432 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.07118996 = weight(abstract_txt:protokoll in 391) [ClassicSimilarity], result of:
            0.07118996 = score(doc=391,freq=1.0), product of:
              0.25150383 = queryWeight, product of:
                1.9283667 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.014398949 = queryNorm
              0.28305718 = fieldWeight in 391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
          0.041621175 = weight(abstract_txt:wurde in 391) [ClassicSimilarity], result of:
            0.041621175 = score(doc=391,freq=4.0), product of:
              0.139572 = queryWeight, product of:
                2.0315685 = boost
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.014398949 = queryNorm
              0.29820576 = fieldWeight in 391, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.03125 = fieldNorm(doc=391)
        0.32 = coord(8/25)
    
  2. Bohne-Lang, A.: Semantische Metadaten für den Webauftritt einer Bibliothek (2016) 0.09
    0.08924428 = sum of:
      0.08924428 = product of:
        0.37185115 = sum of:
          0.03238944 = weight(abstract_txt:entwickelt in 3337) [ClassicSimilarity], result of:
            0.03238944 = score(doc=3337,freq=1.0), product of:
              0.09372333 = queryWeight, product of:
                1.1771754 = boost
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.014398949 = queryNorm
              0.34558567 = fieldWeight in 3337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
          0.02301991 = weight(abstract_txt:eine in 3337) [ClassicSimilarity], result of:
            0.02301991 = score(doc=3337,freq=2.0), product of:
              0.07464164 = queryWeight, product of:
                1.4856719 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.014398949 = queryNorm
              0.30840576 = fieldWeight in 3337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
          0.08837732 = weight(abstract_txt:ursprünglich in 3337) [ClassicSimilarity], result of:
            0.08837732 = score(doc=3337,freq=1.0), product of:
              0.18300907 = queryWeight, product of:
                1.6449535 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.014398949 = queryNorm
              0.4829122 = fieldWeight in 3337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
          0.041621175 = weight(abstract_txt:wurde in 3337) [ClassicSimilarity], result of:
            0.041621175 = score(doc=3337,freq=1.0), product of:
              0.139572 = queryWeight, product of:
                2.0315685 = boost
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.014398949 = queryNorm
              0.29820576 = fieldWeight in 3337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
          0.081609964 = weight(abstract_txt:graph in 3337) [ClassicSimilarity], result of:
            0.081609964 = score(doc=3337,freq=1.0), product of:
              0.19865733 = queryWeight, product of:
                2.0990136 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.014398949 = queryNorm
              0.4108077 = fieldWeight in 3337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
          0.104833335 = weight(abstract_txt:google in 3337) [ClassicSimilarity], result of:
            0.104833335 = score(doc=3337,freq=2.0), product of:
              0.22090982 = queryWeight, product of:
                2.8575566 = boost
                5.3689504 = idf(docFreq=559, maxDocs=44218)
                0.014398949 = queryNorm
              0.47455263 = fieldWeight in 3337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3689504 = idf(docFreq=559, maxDocs=44218)
                0.0625 = fieldNorm(doc=3337)
        0.24 = coord(6/25)
    
  3. Qualität in der Inhaltserschließung (2021) 0.08
    0.08447376 = sum of:
      0.08447376 = product of:
        0.703948 = sum of:
          0.118852116 = weight(abstract_txt:qualität in 753) [ClassicSimilarity], result of:
            0.118852116 = score(doc=753,freq=3.0), product of:
              0.11798192 = queryWeight, product of:
                1.3207635 = boost
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.014398949 = queryNorm
              1.0073757 = fieldWeight in 753, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.09375 = fieldNorm(doc=753)
          0.21735516 = weight(abstract_txt:normdaten in 753) [ClassicSimilarity], result of:
            0.21735516 = score(doc=753,freq=1.0), product of:
              0.29129183 = queryWeight, product of:
                2.5417166 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.014398949 = queryNorm
              0.74617666 = fieldWeight in 753, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.09375 = fieldNorm(doc=753)
          0.36774072 = weight(abstract_txt:inhaltserschließung in 753) [ClassicSimilarity], result of:
            0.36774072 = score(doc=753,freq=3.0), product of:
              0.31563267 = queryWeight, product of:
                3.0550852 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.014398949 = queryNorm
              1.1650908 = fieldWeight in 753, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.09375 = fieldNorm(doc=753)
        0.12 = coord(3/25)
    
  4. Darstellung der CrissCross-Mappingrelationen im Rahmen des Semantic Web (2010) 0.08
    0.0800979 = sum of:
      0.0800979 = product of:
        0.33374125 = sum of:
          0.0202434 = weight(abstract_txt:entwickelt in 4285) [ClassicSimilarity], result of:
            0.0202434 = score(doc=4285,freq=1.0), product of:
              0.09372333 = queryWeight, product of:
                1.1771754 = boost
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.014398949 = queryNorm
              0.21599105 = fieldWeight in 4285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
          0.03558123 = weight(abstract_txt:miteinander in 4285) [ClassicSimilarity], result of:
            0.03558123 = score(doc=4285,freq=1.0), product of:
              0.13650212 = queryWeight, product of:
                1.4206498 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.014398949 = queryNorm
              0.2606643 = fieldWeight in 4285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
          0.022748547 = weight(abstract_txt:eine in 4285) [ClassicSimilarity], result of:
            0.022748547 = score(doc=4285,freq=5.0), product of:
              0.07464164 = queryWeight, product of:
                1.4856719 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.014398949 = queryNorm
              0.3047702 = fieldWeight in 4285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
          0.026013233 = weight(abstract_txt:wurde in 4285) [ClassicSimilarity], result of:
            0.026013233 = score(doc=4285,freq=1.0), product of:
              0.139572 = queryWeight, product of:
                2.0315685 = boost
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.014398949 = queryNorm
              0.1863786 = fieldWeight in 4285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
          0.07592954 = weight(abstract_txt:verwendet in 4285) [ClassicSimilarity], result of:
            0.07592954 = score(doc=4285,freq=2.0), product of:
              0.2055668 = queryWeight, product of:
                2.1352043 = boost
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.014398949 = queryNorm
              0.36936674 = fieldWeight in 4285, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
          0.1532253 = weight(abstract_txt:inhaltserschließung in 4285) [ClassicSimilarity], result of:
            0.1532253 = score(doc=4285,freq=3.0), product of:
              0.31563267 = queryWeight, product of:
                3.0550852 = boost
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.014398949 = queryNorm
              0.4854545 = fieldWeight in 4285, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.1750984 = idf(docFreq=91, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4285)
        0.24 = coord(6/25)
    
  5. Haffner, A.: Internationalisierung der GND durch das Semantic Web (2012) 0.08
    0.07763694 = sum of:
      0.07763694 = product of:
        0.32348725 = sum of:
          0.02859138 = weight(abstract_txt:qualität in 318) [ClassicSimilarity], result of:
            0.02859138 = score(doc=318,freq=1.0), product of:
              0.11798192 = queryWeight, product of:
                1.3207635 = boost
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.014398949 = queryNorm
              0.24233696 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
          0.03558123 = weight(abstract_txt:miteinander in 318) [ClassicSimilarity], result of:
            0.03558123 = score(doc=318,freq=1.0), product of:
              0.13650212 = queryWeight, product of:
                1.4206498 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.014398949 = queryNorm
              0.2606643 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
          0.022748547 = weight(abstract_txt:eine in 318) [ClassicSimilarity], result of:
            0.022748547 = score(doc=318,freq=5.0), product of:
              0.07464164 = queryWeight, product of:
                1.4856719 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.014398949 = queryNorm
              0.3047702 = fieldWeight in 318, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
          0.026013233 = weight(abstract_txt:wurde in 318) [ClassicSimilarity], result of:
            0.026013233 = score(doc=318,freq=1.0), product of:
              0.139572 = queryWeight, product of:
                2.0315685 = boost
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.014398949 = queryNorm
              0.1863786 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.771292 = idf(docFreq=1017, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
          0.05369029 = weight(abstract_txt:verwendet in 318) [ClassicSimilarity], result of:
            0.05369029 = score(doc=318,freq=1.0), product of:
              0.2055668 = queryWeight, product of:
                2.1352043 = boost
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.014398949 = queryNorm
              0.2611817 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
          0.15686257 = weight(abstract_txt:normdaten in 318) [ClassicSimilarity], result of:
            0.15686257 = score(doc=318,freq=3.0), product of:
              0.29129183 = queryWeight, product of:
                2.5417166 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.014398949 = queryNorm
              0.53850657 = fieldWeight in 318, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0390625 = fieldNorm(doc=318)
        0.24 = coord(6/25)
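
The content-match scores combine more factors than the author matches: each matching term contributes queryWeight * fieldWeight, and the sum is scaled by the coord factor (matched terms / query terms). The sketch below recomputes the third content match (doc=753) from the values in the listing. It assumes the same ClassicSimilarity formulas as the explain trees suggest; the constant and function names are illustrative, and boost, queryNorm, and fieldNorm are copied from the listing rather than derived.

```python
import math

QUERY_NORM = 0.014398949   # queryNorm from the listing
MAX_DOCS = 44218           # maxDocs from the listing
QUERY_TERMS = 25           # denominator of coord(3/25)


def idf(doc_freq: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (boost, freq, docFreq, fieldNorm) for the three terms matched in doc=753
matched = [
    (1.3207635, 3.0, 242, 0.09375),  # abstract_txt:qualität
    (2.5417166, 1.0, 41, 0.09375),   # abstract_txt:normdaten
    (3.0550852, 3.0, 91, 0.09375),   # abstract_txt:inhaltserschließung
]

raw_sum = sum(term_score(*term) for term in matched)
coord = len(matched) / QUERY_TERMS
score = coord * raw_sum
print(raw_sum, coord, score)
```

Under these assumptions the result agrees with the listed 0.08447376 to within floating-point rounding. The low coord factor (3 of 25 query terms) explains why this record ranks below matches with weaker individual terms but broader overlap.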