Document (#42211)

Author
Wiesenmüller, H.
Title
Maschinelle Indexierung am Beispiel der DNB : Analyse und Entwicklungsmöglichkeiten
Source
o-bib: Das offene Bibliotheksjournal. 5(2018) no.4, pp.141-153
Year
2018
Abstract
The article examines the results of the procedure used at the German National Library (Deutsche Nationalbibliothek, DNB) for the automatic assignment of subject headings. Since 2017 this procedure has also been applied to print publications in series B and H of the Deutsche Nationalbibliografie. The central problem areas are presented and illustrated with examples: for instance, not all words occurring in a table of contents actually express thematic aspects, and the software very often fails to recognize corporate bodies and other "named entities". The machine-generated results are currently very unsatisfactory. The article puts forward considerations for possible improvements and sensible strategies.
Content
Paper presented at the 107th Deutscher Bibliothekartag 2018 in Berlin, topic area "Fokus Erschließen & Bewahren". https://www.o-bib.de/article/view/5396. https://doi.org/10.5282/o-bib/2018H4S141-153.
Theme
Automatic indexing
Object
DNB
Location
D
Aid
Digitaler Assistent (Averbis)

Similar documents (author)

  1. Wiesenmüller, H.: Gewogen und für zu leicht befunden : die Ergebnisse des RDA Tests in den USA (2011) 4.98
    4.9757953 = sum of:
      4.9757953 = weight(author_txt:wiesenmüller in 661) [ClassicSimilarity], result of:
        4.9757953 = fieldWeight in 661, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.9612727 = idf(docFreq=40, maxDocs=43254)
          0.625 = fieldNorm(doc=661)
    
  2. Wiesenmüller, H.: ¬Das Konzept der "Virtuellen Bibliothek" im deutschen Bibliothekswesen der 1990er Jahre (2000) 4.98
    4.9757953 = sum of:
      4.9757953 = weight(author_txt:wiesenmüller in 2124) [ClassicSimilarity], result of:
        4.9757953 = fieldWeight in 2124, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.9612727 = idf(docFreq=40, maxDocs=43254)
          0.625 = fieldNorm(doc=2124)
    
  3. Wiesenmüller, H.: Von Fröschen und Strategen : Ein kleiner Leitfaden zur AACR2-Debatte (2002) 4.98
    4.9757953 = sum of:
      4.9757953 = weight(author_txt:wiesenmüller in 2637) [ClassicSimilarity], result of:
        4.9757953 = fieldWeight in 2637, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.9612727 = idf(docFreq=40, maxDocs=43254)
          0.625 = fieldNorm(doc=2637)
    
  4. Wiesenmüller, H.: Versuch eines Fazits (2002) 4.98
    4.9757953 = sum of:
      4.9757953 = weight(author_txt:wiesenmüller in 3101) [ClassicSimilarity], result of:
        4.9757953 = fieldWeight in 3101, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.9612727 = idf(docFreq=40, maxDocs=43254)
          0.625 = fieldNorm(doc=3101)
    
  5. Wiesenmüller, H.: Langzeitarchivierung von Online-Publikationen an Regionalbibliotheken : Das Projekt 'Baden-Württembergisches Online-Archiv' (BOA) (2004) 4.98
    4.9757953 = sum of:
      4.9757953 = weight(author_txt:wiesenmüller in 4284) [ClassicSimilarity], result of:
        4.9757953 = fieldWeight in 4284, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.9612727 = idf(docFreq=40, maxDocs=43254)
          0.625 = fieldNorm(doc=4284)
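
Note on the author scores: each tree above is a single-term fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, not the search engine's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the figures listed here; other Lucene versions may use a slightly different idf variant. The function names are hypothetical.

```python
import math


def classic_tf(freq: float) -> float:
    # Assumed term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf variant; it matches idf(docFreq=40, maxDocs=43254) above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain trees.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the author_txt:wiesenmüller entries above.
score = field_weight(freq=1.0, doc_freq=40, max_docs=43254, field_norm=0.625)
print(f"{score:.4f}")
```

Because all five records match the same single term with the same frequency and fieldNorm, they receive the same score.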
    

Similar documents (content)

  1. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.15
    0.15074584 = sum of:
      0.15074584 = product of:
        0.538378 = sum of:
          0.023978727 = weight(abstract_txt:nicht in 3953) [ClassicSimilarity], result of:
            0.023978727 = score(doc=3953,freq=3.0), product of:
              0.08942294 = queryWeight, product of:
                1.0130222 = boost
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.02227273 = queryNorm
              0.2681496 = fieldWeight in 3953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.070324235 = weight(abstract_txt:angestellt in 3953) [ClassicSimilarity], result of:
            0.070324235 = score(doc=3953,freq=1.0), product of:
              0.2097323 = queryWeight, product of:
                1.097014 = boost
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.02227273 = queryNorm
              0.33530477 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.072692886 = weight(abstract_txt:reihen in 3953) [ClassicSimilarity], result of:
            0.072692886 = score(doc=3953,freq=1.0), product of:
              0.21441568 = queryWeight, product of:
                1.1091946 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.02227273 = queryNorm
              0.33902782 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.18462428 = weight(abstract_txt:nationalbibliografie in 3953) [ClassicSimilarity], result of:
            0.18462428 = score(doc=3953,freq=6.0), product of:
              0.21965313 = queryWeight, product of:
                1.1226598 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.02227273 = queryNorm
              0.8405265 = fieldWeight in 3953, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.036387328 = weight(abstract_txt:dass in 3953) [ClassicSimilarity], result of:
            0.036387328 = score(doc=3953,freq=3.0), product of:
              0.11808597 = queryWeight, product of:
                1.164109 = boost
                4.5544004 = idf(docFreq=1236, maxDocs=43254)
                0.02227273 = queryNorm
              0.3081427 = fieldWeight in 3953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5544004 = idf(docFreq=1236, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.11335066 = weight(abstract_txt:deutschen in 3953) [ClassicSimilarity], result of:
            0.11335066 = score(doc=3953,freq=14.0), product of:
              0.15072268 = queryWeight, product of:
                1.3151758 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.02227273 = queryNorm
              0.7520478 = fieldWeight in 3953, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
          0.037019867 = weight(abstract_txt:ergebnisse in 3953) [ClassicSimilarity], result of:
            0.037019867 = score(doc=3953,freq=1.0), product of:
              0.17227748 = queryWeight, product of:
                1.4060758 = boost
                5.501059 = idf(docFreq=479, maxDocs=43254)
                0.02227273 = queryNorm
              0.21488512 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.501059 = idf(docFreq=479, maxDocs=43254)
                0.0390625 = fieldNorm(doc=3953)
        0.28 = coord(7/25)
    
  2. Lepsky, K.: Automatische Indexierung des Reallexikons zur Deutschen Kunstgeschichte (2006) 0.11
    0.10794867 = sum of:
      0.10794867 = product of:
        0.38553095 = sum of:
          0.0499157 = weight(abstract_txt:nicht in 1081) [ClassicSimilarity], result of:
            0.0499157 = score(doc=1081,freq=13.0), product of:
              0.08942294 = queryWeight, product of:
                1.0130222 = boost
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.02227273 = queryNorm
              0.5581979 = fieldWeight in 1081, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.057728443 = weight(abstract_txt:sinnvolle in 1081) [ClassicSimilarity], result of:
            0.057728443 = score(doc=1081,freq=1.0), product of:
              0.18387465 = queryWeight, product of:
                1.0271655 = boost
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.02227273 = queryNorm
              0.31395543 = fieldWeight in 1081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.06375066 = weight(abstract_txt:maschinell in 1081) [ClassicSimilarity], result of:
            0.06375066 = score(doc=1081,freq=1.0), product of:
              0.19644988 = queryWeight, product of:
                1.0617087 = boost
                8.307549 = idf(docFreq=28, maxDocs=43254)
                0.02227273 = queryNorm
              0.32451364 = fieldWeight in 1081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.307549 = idf(docFreq=28, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.042016473 = weight(abstract_txt:dass in 1081) [ClassicSimilarity], result of:
            0.042016473 = score(doc=1081,freq=4.0), product of:
              0.11808597 = queryWeight, product of:
                1.164109 = boost
                4.5544004 = idf(docFreq=1236, maxDocs=43254)
                0.02227273 = queryNorm
              0.35581255 = fieldWeight in 1081, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5544004 = idf(docFreq=1236, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.08877313 = weight(abstract_txt:inhaltsverzeichnis in 1081) [ClassicSimilarity], result of:
            0.08877313 = score(doc=1081,freq=1.0), product of:
              0.24497192 = queryWeight, product of:
                1.1855985 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.02227273 = queryNorm
              0.36238086 = fieldWeight in 1081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.042842522 = weight(abstract_txt:deutschen in 1081) [ClassicSimilarity], result of:
            0.042842522 = score(doc=1081,freq=2.0), product of:
              0.15072268 = queryWeight, product of:
                1.3151758 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.02227273 = queryNorm
              0.28424734 = fieldWeight in 1081, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
          0.040504053 = weight(abstract_txt:sehr in 1081) [ClassicSimilarity], result of:
            0.040504053 = score(doc=1081,freq=1.0), product of:
              0.1829241 = queryWeight, product of:
                1.4488719 = boost
                5.668492 = idf(docFreq=405, maxDocs=43254)
                0.02227273 = queryNorm
              0.22142546 = fieldWeight in 1081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.668492 = idf(docFreq=405, maxDocs=43254)
                0.0390625 = fieldNorm(doc=1081)
        0.28 = coord(7/25)
    
  3. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.10618987 = sum of:
      0.10618987 = product of:
        0.53094935 = sum of:
          0.019381775 = weight(abstract_txt:nicht in 3797) [ClassicSimilarity], result of:
            0.019381775 = score(doc=3797,freq=1.0), product of:
              0.08942294 = queryWeight, product of:
                1.0130222 = boost
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.02227273 = queryNorm
              0.21674275 = fieldWeight in 3797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3797)
          0.09845394 = weight(abstract_txt:angestellt in 3797) [ClassicSimilarity], result of:
            0.09845394 = score(doc=3797,freq=1.0), product of:
              0.2097323 = queryWeight, product of:
                1.097014 = boost
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.02227273 = queryNorm
              0.4694267 = fieldWeight in 3797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3797)
          0.14392456 = weight(abstract_txt:reihen in 3797) [ClassicSimilarity], result of:
            0.14392456 = score(doc=3797,freq=2.0), product of:
              0.21441568 = queryWeight, product of:
                1.1091946 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.02227273 = queryNorm
              0.67124087 = fieldWeight in 3797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3797)
          0.14923002 = weight(abstract_txt:nationalbibliografie in 3797) [ClassicSimilarity], result of:
            0.14923002 = score(doc=3797,freq=2.0), product of:
              0.21965313 = queryWeight, product of:
                1.1226598 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.02227273 = queryNorm
              0.6793895 = fieldWeight in 3797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3797)
          0.11995905 = weight(abstract_txt:deutschen in 3797) [ClassicSimilarity], result of:
            0.11995905 = score(doc=3797,freq=8.0), product of:
              0.15072268 = queryWeight, product of:
                1.3151758 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.02227273 = queryNorm
              0.7958925 = fieldWeight in 3797, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3797)
        0.2 = coord(5/25)
    
  4. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.10618987 = sum of:
      0.10618987 = product of:
        0.53094935 = sum of:
          0.019381775 = weight(abstract_txt:nicht in 4035) [ClassicSimilarity], result of:
            0.019381775 = score(doc=4035,freq=1.0), product of:
              0.08942294 = queryWeight, product of:
                1.0130222 = boost
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.02227273 = queryNorm
              0.21674275 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.963296 = idf(docFreq=2233, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4035)
          0.09845394 = weight(abstract_txt:angestellt in 4035) [ClassicSimilarity], result of:
            0.09845394 = score(doc=4035,freq=1.0), product of:
              0.2097323 = queryWeight, product of:
                1.097014 = boost
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.02227273 = queryNorm
              0.4694267 = fieldWeight in 4035, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4035)
          0.14392456 = weight(abstract_txt:reihen in 4035) [ClassicSimilarity], result of:
            0.14392456 = score(doc=4035,freq=2.0), product of:
              0.21441568 = queryWeight, product of:
                1.1091946 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.02227273 = queryNorm
              0.67124087 = fieldWeight in 4035, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4035)
          0.14923002 = weight(abstract_txt:nationalbibliografie in 4035) [ClassicSimilarity], result of:
            0.14923002 = score(doc=4035,freq=2.0), product of:
              0.21965313 = queryWeight, product of:
                1.1226598 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.02227273 = queryNorm
              0.6793895 = fieldWeight in 4035, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4035)
          0.11995905 = weight(abstract_txt:deutschen in 4035) [ClassicSimilarity], result of:
            0.11995905 = score(doc=4035,freq=8.0), product of:
              0.15072268 = queryWeight, product of:
                1.3151758 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.02227273 = queryNorm
              0.7958925 = fieldWeight in 4035, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4035)
        0.2 = coord(5/25)
    
  5. Lepsky, K.: Auf dem Weg zur automatischen Inhaltserschließung? : Das DFG-Projekt MILOS und seine Ergebnisse (1997) 0.10
    0.10269952 = sum of:
      0.10269952 = product of:
        0.641872 = sum of:
          0.16163965 = weight(abstract_txt:sinnvolle in 2012) [ClassicSimilarity], result of:
            0.16163965 = score(doc=2012,freq=1.0), product of:
              0.18387465 = queryWeight, product of:
                1.0271655 = boost
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.02227273 = queryNorm
              0.8790752 = fieldWeight in 2012, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.109375 = fieldNorm(doc=2012)
          0.29175285 = weight(abstract_txt:verfahrens in 2012) [ClassicSimilarity], result of:
            0.29175285 = score(doc=2012,freq=3.0), product of:
              0.18899903 = queryWeight, product of:
                1.04138 = boost
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.02227273 = queryNorm
              1.5436738 = fieldWeight in 2012, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.109375 = fieldNorm(doc=2012)
          0.08482386 = weight(abstract_txt:deutschen in 2012) [ClassicSimilarity], result of:
            0.08482386 = score(doc=2012,freq=1.0), product of:
              0.15072268 = queryWeight, product of:
                1.3151758 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.02227273 = queryNorm
              0.562781 = fieldWeight in 2012, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.109375 = fieldNorm(doc=2012)
          0.10365562 = weight(abstract_txt:ergebnisse in 2012) [ClassicSimilarity], result of:
            0.10365562 = score(doc=2012,freq=1.0), product of:
              0.17227748 = queryWeight, product of:
                1.4060758 = boost
                5.501059 = idf(docFreq=479, maxDocs=43254)
                0.02227273 = queryNorm
              0.6016783 = fieldWeight in 2012, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.501059 = idf(docFreq=479, maxDocs=43254)
                0.109375 = fieldNorm(doc=2012)
        0.16 = coord(4/25)
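
Note on the content scores: each tree above multiplies a coord factor (matched terms / query terms) by the sum of per-term scores, where a term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm). The sketch below recomputes the score of entry 5 (Lepsky 1997) from the listed inputs. It is an illustration under the same assumed tf and idf formulas as the explain output suggests, not the engine's implementation; all function and variable names are hypothetical.

```python
import math

MAX_DOCS = 43254          # maxDocs as listed in the trees above
QUERY_NORM = 0.02227273   # queryNorm as listed in the trees above


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf variant that matches the listed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched: int, total_query_terms: int) -> float:
    # coord rewards documents that match more of the query terms.
    coord = matched / total_query_terms
    return coord * sum(term_score(*t) for t in terms)


# (boost, docFreq, freq, fieldNorm) for entry 5, copied from the tree above.
lepsky_1997 = [
    (1.0271655, 37, 1.0, 0.109375),   # sinnvolle
    (1.04138, 33, 3.0, 0.109375),     # verfahrens
    (1.3151758, 684, 1.0, 0.109375),  # deutschen
    (1.4060758, 479, 1.0, 0.109375),  # ergebnisse
]

score = document_score(lepsky_1997, matched=4, total_query_terms=25)
print(f"{score:.4f}")
```

The result should agree with the listed 0.10269952 up to rounding, since the trees use single-precision floats and this sketch uses double precision.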