Document (#39472)

Author
Kasprzik, A.
Title
Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte
Source
Perspektive Bibliothek. 3(2014) no.1, pp.85-110
Year
2014
Abstract
The rapid growth in the number of digitally available documents, combined with the shortage of time and staff at academic libraries, suggests the use of semi-automatic or fully automatic methods for verbal and classificatory subject indexing. After a brief general introduction to the common methodology, this article examines a number of automated classification projects from the period 2007-2012 in the German-speaking countries. Most of the projects presented use machine learning methods from artificial intelligence, mostly work with adapted versions of commercial software, and generally refer to the Dewey Decimal Classification (DDC). The underlying data are metadata records, abstracts, tables of contents, and full texts in various data formats. The concluding analysis arranges the projects according to a number of criteria and summarizes the current situation and the greatest challenges for automated classification methods.
Content
Cf.: https://journals.ub.uni-heidelberg.de/index.php/bibliothek/article/view/14022.
Theme
Automatic indexing
Automatic classification

Similar documents (author)

  1. Kasprzik, A.: Implementierung eines Hierarchisierungsalgorithmus' für die Konstanzer Systematik : Projektbericht (2013) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:kasprzik in 2742) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 2742, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=2742)
    
  2. Kasprzik, A.: Vorläufer der Internationalen Katalogisierungsprinzipien (2014) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:kasprzik in 3084) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 3084, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=3084)
    
  3. Kasprzik, A.: Voraussetzungen und Anwendungspotentiale einer präzisen Sacherschließung aus Sicht der Wissenschaft (2018) 6.00
    5.9971275 = sum of:
      5.9971275 = weight(author_txt:kasprzik in 196) [ClassicSimilarity], result of:
        5.9971275 = fieldWeight in 196, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.625 = fieldNorm(doc=196)
    
  4. Kasprzik, A.; Kett, J.: Vorschläge für eine Weiterentwicklung der Sacherschließung und Schritte zur fortgesetzten strukturellen Aufwertung der GND (2018) 4.80
    4.797702 = sum of:
      4.797702 = weight(author_txt:kasprzik in 600) [ClassicSimilarity], result of:
        4.797702 = fieldWeight in 600, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.5 = fieldNorm(doc=600)
    
  5. Auer, S.; Kasprzik, A.; Sens, I.: Von dokumentenbasierten zu wissensbasierten Informationsflüssen : Die Rolle wissenschaftlicher Bibliotheken im Transformationsprozess. Teil 1: Vor einer Revolution der wissenschaftlichen Kommunikation (2019) 3.60
    3.5982764 = sum of:
      3.5982764 = weight(author_txt:kasprzik in 242) [ClassicSimilarity], result of:
        3.5982764 = fieldWeight in 242, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.595404 = idf(docFreq=7, maxDocs=43254)
          0.375 = fieldNorm(doc=242)
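
The author scores above can be recomputed from the listed factors. The following is a minimal sketch, assuming the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values shown in the trees. The function names are hypothetical and not part of any library.

```python
import math

MAX_DOCS = 43254  # maxDocs as reported in the explain trees


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, field_norm, max_docs=MAX_DOCS):
    """fieldWeight = tf * idf * fieldNorm, as in the trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# The five author hits differ only in fieldNorm (0.625, 0.5, 0.375).
author_scores = {
    norm: field_weight(freq=1.0, doc_freq=7, field_norm=norm)
    for norm in (0.625, 0.5, 0.375)
}

for norm, score in author_scores.items():
    print(f"fieldNorm={norm}: score={score:.4f}")
```

Because tf and idf are identical for all five hits, the ranking among them is determined by fieldNorm alone.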
    

Similar documents (content)

  1. Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.16
    0.16185579 = sum of:
      0.16185579 = product of:
        0.67439914 = sum of:
          0.08636864 = weight(abstract_txt:klassifikatorische in 4488) [ClassicSimilarity], result of:
            0.08636864 = score(doc=4488,freq=1.0), product of:
              0.18089794 = queryWeight, product of:
                1.0862417 = boost
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.019075358 = queryNorm
              0.47744405 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
          0.050720003 = weight(abstract_txt:analyse in 4488) [ClassicSimilarity], result of:
            0.050720003 = score(doc=4488,freq=1.0), product of:
              0.15983027 = queryWeight, product of:
                1.4439567 = boost
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.019075358 = queryNorm
              0.31733665 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
          0.04601644 = weight(abstract_txt:einer in 4488) [ClassicSimilarity], result of:
            0.04601644 = score(doc=4488,freq=4.0), product of:
              0.10801696 = queryWeight, product of:
                1.4538388 = boost
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.019075358 = queryNorm
              0.42601123 = fieldWeight in 4488, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
          0.02256156 = weight(abstract_txt:eine in 4488) [ClassicSimilarity], result of:
            0.02256156 = score(doc=4488,freq=1.0), product of:
              0.11734437 = queryWeight, product of:
                1.749729 = boost
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.019075358 = queryNorm
              0.19226792 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
          0.20545423 = weight(abstract_txt:klassifizierung in 4488) [ClassicSimilarity], result of:
            0.20545423 = score(doc=4488,freq=2.0), product of:
              0.32235885 = queryWeight, product of:
                2.0506637 = boost
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.019075358 = queryNorm
              0.6373463 = fieldWeight in 4488, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
          0.26327828 = weight(abstract_txt:projekte in 4488) [ClassicSimilarity], result of:
            0.26327828 = score(doc=4488,freq=5.0), product of:
              0.32076722 = queryWeight, product of:
                2.505332 = boost
                6.7120004 = idf(docFreq=142, maxDocs=43254)
                0.019075358 = queryNorm
              0.82077676 = fieldWeight in 4488, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7120004 = idf(docFreq=142, maxDocs=43254)
                0.0546875 = fieldNorm(doc=4488)
        0.24 = coord(6/25)
    
  2. Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.16
    0.16185579 = sum of:
      0.16185579 = product of:
        0.67439914 = sum of:
          0.08636864 = weight(abstract_txt:klassifikatorische in 1164) [ClassicSimilarity], result of:
            0.08636864 = score(doc=1164,freq=1.0), product of:
              0.18089794 = queryWeight, product of:
                1.0862417 = boost
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.019075358 = queryNorm
              0.47744405 = fieldWeight in 1164, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
          0.050720003 = weight(abstract_txt:analyse in 1164) [ClassicSimilarity], result of:
            0.050720003 = score(doc=1164,freq=1.0), product of:
              0.15983027 = queryWeight, product of:
                1.4439567 = boost
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.019075358 = queryNorm
              0.31733665 = fieldWeight in 1164, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
          0.04601644 = weight(abstract_txt:einer in 1164) [ClassicSimilarity], result of:
            0.04601644 = score(doc=1164,freq=4.0), product of:
              0.10801696 = queryWeight, product of:
                1.4538388 = boost
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.019075358 = queryNorm
              0.42601123 = fieldWeight in 1164, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
          0.02256156 = weight(abstract_txt:eine in 1164) [ClassicSimilarity], result of:
            0.02256156 = score(doc=1164,freq=1.0), product of:
              0.11734437 = queryWeight, product of:
                1.749729 = boost
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.019075358 = queryNorm
              0.19226792 = fieldWeight in 1164, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
          0.20545423 = weight(abstract_txt:klassifizierung in 1164) [ClassicSimilarity], result of:
            0.20545423 = score(doc=1164,freq=2.0), product of:
              0.32235885 = queryWeight, product of:
                2.0506637 = boost
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.019075358 = queryNorm
              0.6373463 = fieldWeight in 1164, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
          0.26327828 = weight(abstract_txt:projekte in 1164) [ClassicSimilarity], result of:
            0.26327828 = score(doc=1164,freq=5.0), product of:
              0.32076722 = queryWeight, product of:
                2.505332 = boost
                6.7120004 = idf(docFreq=142, maxDocs=43254)
                0.019075358 = queryNorm
              0.82077676 = fieldWeight in 1164, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7120004 = idf(docFreq=142, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1164)
        0.24 = coord(6/25)
    
  3. Panzer, M.: Semantische Integration heterogener und unterschiedlichsprachiger Wissensorganisationssysteme : CrissCross und jenseits (2008) 0.10
    0.095430374 = sum of:
      0.095430374 = product of:
        0.39762658 = sum of:
          0.100319475 = weight(abstract_txt:verbale in 800) [ClassicSimilarity], result of:
            0.100319475 = score(doc=800,freq=1.0), product of:
              0.15758628 = queryWeight, product of:
                1.0138386 = boost
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.019075358 = queryNorm
              0.6366003 = fieldWeight in 800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
          0.1154594 = weight(abstract_txt:automatisierten in 800) [ClassicSimilarity], result of:
            0.1154594 = score(doc=800,freq=1.0), product of:
              0.17306705 = queryWeight, product of:
                1.0624704 = boost
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.019075358 = queryNorm
              0.6671368 = fieldWeight in 800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
          0.030940453 = weight(abstract_txt:nach in 800) [ClassicSimilarity], result of:
            0.030940453 = score(doc=800,freq=1.0), product of:
              0.09063362 = queryWeight, product of:
                1.0873499 = boost
                4.3696566 = idf(docFreq=1487, maxDocs=43254)
                0.019075358 = queryNorm
              0.3413794 = fieldWeight in 800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3696566 = idf(docFreq=1487, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
          0.07245714 = weight(abstract_txt:analyse in 800) [ClassicSimilarity], result of:
            0.07245714 = score(doc=800,freq=1.0), product of:
              0.15983027 = queryWeight, product of:
                1.4439567 = boost
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.019075358 = queryNorm
              0.45333806 = fieldWeight in 800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
          0.032868885 = weight(abstract_txt:einer in 800) [ClassicSimilarity], result of:
            0.032868885 = score(doc=800,freq=1.0), product of:
              0.10801696 = queryWeight, product of:
                1.4538388 = boost
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.019075358 = queryNorm
              0.30429375 = fieldWeight in 800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
          0.045581233 = weight(abstract_txt:eine in 800) [ClassicSimilarity], result of:
            0.045581233 = score(doc=800,freq=2.0), product of:
              0.11734437 = queryWeight, product of:
                1.749729 = boost
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.019075358 = queryNorm
              0.38843986 = fieldWeight in 800, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.078125 = fieldNorm(doc=800)
        0.24 = coord(6/25)
    
  4. Müller, C.; Sternitzke, N.; Stratmann, R.; Parschik, T.: Kataloganreicherung und Zeitschriftenerschließung mit MyBib eDoc und C-3 am Ibero-Amerikanischen Institut, Preußischer Kulturbesitz : Neue Verfahren zur Optimierung der bibliografischen Nachweissituation in einer großen Spezialbibliothek (2010) 0.09
    0.09287511 = sum of:
      0.09287511 = product of:
        0.38697964 = sum of:
          0.06927563 = weight(abstract_txt:automatisierten in 500) [ClassicSimilarity], result of:
            0.06927563 = score(doc=500,freq=1.0), product of:
              0.17306705 = queryWeight, product of:
                1.0624704 = boost
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.019075358 = queryNorm
              0.40028206 = fieldWeight in 500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
          0.121872514 = weight(abstract_txt:inhaltsverzeichnisse in 500) [ClassicSimilarity], result of:
            0.121872514 = score(doc=500,freq=3.0), product of:
              0.17487356 = queryWeight, product of:
                1.0680012 = boost
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.019075358 = queryNorm
              0.6969179 = fieldWeight in 500, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.583802 = idf(docFreq=21, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
          0.074030265 = weight(abstract_txt:gängige in 500) [ClassicSimilarity], result of:
            0.074030265 = score(doc=500,freq=1.0), product of:
              0.18089794 = queryWeight, product of:
                1.0862417 = boost
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.019075358 = queryNorm
              0.40923777 = fieldWeight in 500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.730406 = idf(docFreq=18, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
          0.026253846 = weight(abstract_txt:nach in 500) [ClassicSimilarity], result of:
            0.026253846 = score(doc=500,freq=2.0), product of:
              0.09063362 = queryWeight, product of:
                1.0873499 = boost
                4.3696566 = idf(docFreq=1487, maxDocs=43254)
                0.019075358 = queryNorm
              0.28967005 = fieldWeight in 500, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3696566 = idf(docFreq=1487, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
          0.034158345 = weight(abstract_txt:einer in 500) [ClassicSimilarity], result of:
            0.034158345 = score(doc=500,freq=3.0), product of:
              0.10801696 = queryWeight, product of:
                1.4538388 = boost
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.019075358 = queryNorm
              0.3162313 = fieldWeight in 500, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
          0.06138902 = weight(abstract_txt:reihe in 500) [ClassicSimilarity], result of:
            0.06138902 = score(doc=500,freq=1.0), product of:
              0.20117061 = queryWeight, product of:
                1.6199698 = boost
                6.510059 = idf(docFreq=174, maxDocs=43254)
                0.019075358 = queryNorm
              0.305159 = fieldWeight in 500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.510059 = idf(docFreq=174, maxDocs=43254)
                0.046875 = fieldNorm(doc=500)
        0.24 = coord(6/25)
    
  5. Kaizik, A.; Gödert, W.; Milanesi, C.: Erfahrungen und Ergebnisse aus der Evaluierung des EU-Projektes EULER im Rahmen des an der FH Köln angesiedelten Projektes EJECT (Evaluation von Subject Gateways des World Wide Web) (2001) 0.08
    0.07779085 = sum of:
      0.07779085 = product of:
        0.38895425 = sum of:
          0.10142611 = weight(abstract_txt:methodik in 802) [ClassicSimilarity], result of:
            0.10142611 = score(doc=802,freq=1.0), product of:
              0.15874305 = queryWeight, product of:
                1.017553 = boost
                8.178337 = idf(docFreq=32, maxDocs=43254)
                0.019075358 = queryNorm
              0.6389326 = fieldWeight in 802, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.178337 = idf(docFreq=32, maxDocs=43254)
                0.078125 = fieldNorm(doc=802)
          0.07245714 = weight(abstract_txt:analyse in 802) [ClassicSimilarity], result of:
            0.07245714 = score(doc=802,freq=1.0), product of:
              0.15983027 = queryWeight, product of:
                1.4439567 = boost
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.019075358 = queryNorm
              0.45333806 = fieldWeight in 802, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.802727 = idf(docFreq=354, maxDocs=43254)
                0.078125 = fieldNorm(doc=802)
          0.05693058 = weight(abstract_txt:einer in 802) [ClassicSimilarity], result of:
            0.05693058 = score(doc=802,freq=3.0), product of:
              0.10801696 = queryWeight, product of:
                1.4538388 = boost
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.019075358 = queryNorm
              0.5270522 = fieldWeight in 802, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.89496 = idf(docFreq=2391, maxDocs=43254)
                0.078125 = fieldNorm(doc=802)
          0.10231504 = weight(abstract_txt:reihe in 802) [ClassicSimilarity], result of:
            0.10231504 = score(doc=802,freq=1.0), product of:
              0.20117061 = queryWeight, product of:
                1.6199698 = boost
                6.510059 = idf(docFreq=174, maxDocs=43254)
                0.019075358 = queryNorm
              0.5085983 = fieldWeight in 802, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.510059 = idf(docFreq=174, maxDocs=43254)
                0.078125 = fieldNorm(doc=802)
          0.05582538 = weight(abstract_txt:eine in 802) [ClassicSimilarity], result of:
            0.05582538 = score(doc=802,freq=3.0), product of:
              0.11734437 = queryWeight, product of:
                1.749729 = boost
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.019075358 = queryNorm
              0.47573972 = fieldWeight in 802, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5157564 = idf(docFreq=3494, maxDocs=43254)
                0.078125 = fieldNorm(doc=802)
        0.2 = coord(5/25)
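
The content scores above combine several matched terms. The following is a minimal sketch of that combination for the first hit (doc 4488), assuming the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the values shown in the tree. The function and variable names are hypothetical; the numeric inputs are copied from the tree.

```python
import math

MAX_DOCS = 43254          # maxDocs from the explain tree
QUERY_NORM = 0.019075358  # queryNorm from the explain tree
FIELD_NORM = 0.0546875    # fieldNorm(doc=4488)


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost):
    """score = queryWeight * fieldWeight for one matched term."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# (term, freq, docFreq, boost) as listed for doc 4488
matched_terms = [
    ("klassifikatorische", 1.0, 18, 1.0862417),
    ("analyse", 1.0, 354, 1.4439567),
    ("einer", 4.0, 2391, 1.4538388),
    ("eine", 1.0, 3494, 1.749729),
    ("klassifizierung", 2.0, 30, 2.0506637),
    ("projekte", 5.0, 142, 2.505332),
]

# Sum of the per-term scores, then scaled by the coordination factor:
# 6 of the 25 query terms matched this document.
term_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in matched_terms)
coord = len(matched_terms) / 25
total = term_sum * coord

print(f"sum={term_sum:.6f} coord={coord} total={total:.6f}")
```

Up to floating-point rounding, the result agrees with the 0.16185579 reported for doc 4488. The coord factor explains why the fifth hit, with only 5 of 25 terms matched, is scaled by 0.2 instead of 0.24.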