Document (#37123)

Author
Jersek, T.
Title
Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse
Imprint
Köln : Fachhochschule
Year
2012
Pages
V, 56 pp.
Abstract
The thesis deals with implementing and carrying out automatic DDC classification using the indexing system Lingo. This is done by incorporating relations from the DFG project CrissCross, on the basis of which Lingo automatically classifies bibliographic title records. The approach used is compared with the usual methodological procedure of automatic classification systems. The classification procedure is then tested on a test collection of bibliographic title records from the Deutsche Nationalbibliothek (DNB). A discussion of the results and an evaluation of the classification system follow.
Content
Diploma thesis, Library Science degree program, Faculty of Information and Communication Sciences, Fachhochschule Köln.
Theme
Automatisches Klassifizieren
Object
Lingo
DDC

Similar documents (content)

  1. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.23
    0.22893368 = sum of:
      0.22893368 = product of:
        1.1446683 = sum of:
          0.049343955 = weight(abstract_txt:einer in 401) [ClassicSimilarity], result of:
            0.049343955 = score(doc=401,freq=2.0), product of:
              0.071872585 = queryWeight, product of:
                1.0367322 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.017850526 = queryNorm
              0.68654764 = fieldWeight in 401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.125 = fieldNorm(doc=401)
          0.06454508 = weight(abstract_txt:durch in 401) [ClassicSimilarity], result of:
            0.06454508 = score(doc=401,freq=2.0), product of:
              0.08596389 = queryWeight, product of:
                1.1338171 = boost
                4.2473893 = idf(docFreq=1718, maxDocs=44218)
                0.017850526 = queryNorm
              0.7508394 = fieldWeight in 401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2473893 = idf(docFreq=1718, maxDocs=44218)
                0.125 = fieldNorm(doc=401)
          0.13864541 = weight(abstract_txt:ergebnisse in 401) [ClassicSimilarity], result of:
            0.13864541 = score(doc=401,freq=2.0), product of:
              0.14311253 = queryWeight, product of:
                1.4629308 = boost
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.017850526 = queryNorm
              0.968786 = fieldWeight in 401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.125 = fieldNorm(doc=401)
          0.1936887 = weight(abstract_txt:automatischen in 401) [ClassicSimilarity], result of:
            0.1936887 = score(doc=401,freq=1.0), product of:
              0.2253306 = queryWeight, product of:
                1.8356719 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017850526 = queryNorm
              0.8595757 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.125 = fieldNorm(doc=401)
          0.69844514 = weight(abstract_txt:lingo in 401) [ClassicSimilarity], result of:
            0.69844514 = score(doc=401,freq=1.0), product of:
              0.60655373 = queryWeight, product of:
                3.688631 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.017850526 = queryNorm
              1.1514976 = fieldWeight in 401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.125 = fieldNorm(doc=401)
        0.2 = coord(5/25)
    
  2. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.21
    0.21011387 = sum of:
      0.21011387 = product of:
        1.3132117 = sum of:
          0.033937488 = weight(abstract_txt:wird in 3581) [ClassicSimilarity], result of:
            0.033937488 = score(doc=3581,freq=2.0), product of:
              0.06784007 = queryWeight, product of:
                1.0072287 = boost
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.017850526 = queryNorm
              0.50025725 = fieldWeight in 3581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.09375 = fieldNorm(doc=3581)
          0.026168583 = weight(abstract_txt:einer in 3581) [ClassicSimilarity], result of:
            0.026168583 = score(doc=3581,freq=1.0), product of:
              0.071872585 = queryWeight, product of:
                1.0367322 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.017850526 = queryNorm
              0.36409688 = fieldWeight in 3581, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.09375 = fieldNorm(doc=3581)
          0.2054379 = weight(abstract_txt:automatischen in 3581) [ClassicSimilarity], result of:
            0.2054379 = score(doc=3581,freq=2.0), product of:
              0.2253306 = queryWeight, product of:
                1.8356719 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017850526 = queryNorm
              0.9117177 = fieldWeight in 3581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.09375 = fieldNorm(doc=3581)
          1.0476677 = weight(abstract_txt:lingo in 3581) [ClassicSimilarity], result of:
            1.0476677 = score(doc=3581,freq=4.0), product of:
              0.60655373 = queryWeight, product of:
                3.688631 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.017850526 = queryNorm
              1.7272464 = fieldWeight in 3581, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.09375 = fieldNorm(doc=3581)
        0.16 = coord(4/25)
    
  3. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.19
    0.19209032 = sum of:
      0.19209032 = product of:
        0.6860368 = sum of:
          0.022358285 = weight(abstract_txt:wird in 1054) [ClassicSimilarity], result of:
            0.022358285 = score(doc=1054,freq=5.0), product of:
              0.06784007 = queryWeight, product of:
                1.0072287 = boost
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.017850526 = queryNorm
              0.32957345 = fieldWeight in 1054, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.02438114 = weight(abstract_txt:einer in 1054) [ClassicSimilarity], result of:
            0.02438114 = score(doc=1054,freq=5.0), product of:
              0.071872585 = queryWeight, product of:
                1.0367322 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.017850526 = queryNorm
              0.33922726 = fieldWeight in 1054, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.020170337 = weight(abstract_txt:durch in 1054) [ClassicSimilarity], result of:
            0.020170337 = score(doc=1054,freq=2.0), product of:
              0.08596389 = queryWeight, product of:
                1.1338171 = boost
                4.2473893 = idf(docFreq=1718, maxDocs=44218)
                0.017850526 = queryNorm
              0.23463732 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2473893 = idf(docFreq=1718, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.07978597 = weight(abstract_txt:indexierungssystem in 1054) [ClassicSimilarity], result of:
            0.07978597 = score(doc=1054,freq=1.0), product of:
              0.21500982 = queryWeight, product of:
                1.2679412 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.017850526 = queryNorm
              0.37108058 = fieldWeight in 1054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.043326695 = weight(abstract_txt:ergebnisse in 1054) [ClassicSimilarity], result of:
            0.043326695 = score(doc=1054,freq=2.0), product of:
              0.14311253 = queryWeight, product of:
                1.4629308 = boost
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.017850526 = queryNorm
              0.30274564 = fieldWeight in 1054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.0594862 = weight(abstract_txt:anhand in 1054) [ClassicSimilarity], result of:
            0.0594862 = score(doc=1054,freq=3.0), product of:
              0.15443808 = queryWeight, product of:
                1.519715 = boost
                5.6930003 = idf(docFreq=404, maxDocs=44218)
                0.017850526 = queryNorm
              0.38517833 = fieldWeight in 1054, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6930003 = idf(docFreq=404, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
          0.43652824 = weight(abstract_txt:lingo in 1054) [ClassicSimilarity], result of:
            0.43652824 = score(doc=1054,freq=4.0), product of:
              0.60655373 = queryWeight, product of:
                3.688631 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.017850526 = queryNorm
              0.71968603 = fieldWeight in 1054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1054)
        0.28 = coord(7/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15442233 = sum of:
      0.15442233 = product of:
        0.5515083 = sum of:
          0.03918764 = weight(abstract_txt:wird in 4854) [ClassicSimilarity], result of:
            0.03918764 = score(doc=4854,freq=6.0), product of:
              0.06784007 = queryWeight, product of:
                1.0072287 = boost
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.017850526 = queryNorm
              0.5776474 = fieldWeight in 4854, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.08400779 = weight(abstract_txt:klassifizierung in 4854) [ClassicSimilarity], result of:
            0.08400779 = score(doc=4854,freq=1.0), product of:
              0.16266984 = queryWeight, product of:
                1.1028678 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.017850526 = queryNorm
              0.5164313 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.09984139 = weight(abstract_txt:klassifiziert in 4854) [ClassicSimilarity], result of:
            0.09984139 = score(doc=4854,freq=1.0), product of:
              0.18251605 = queryWeight, product of:
                1.1682088 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.017850526 = queryNorm
              0.547028 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.12765755 = weight(abstract_txt:titeldatensätze in 4854) [ClassicSimilarity], result of:
            0.12765755 = score(doc=4854,freq=1.0), product of:
              0.21500982 = queryWeight, product of:
                1.2679412 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.017850526 = queryNorm
              0.5937289 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.049018558 = weight(abstract_txt:ergebnisse in 4854) [ClassicSimilarity], result of:
            0.049018558 = score(doc=4854,freq=1.0), product of:
              0.14311253 = queryWeight, product of:
                1.4629308 = boost
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.017850526 = queryNorm
              0.34251758 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.054951 = weight(abstract_txt:anhand in 4854) [ClassicSimilarity], result of:
            0.054951 = score(doc=4854,freq=1.0), product of:
              0.15443808 = queryWeight, product of:
                1.519715 = boost
                5.6930003 = idf(docFreq=404, maxDocs=44218)
                0.017850526 = queryNorm
              0.35581252 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6930003 = idf(docFreq=404, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
          0.09684435 = weight(abstract_txt:automatischen in 4854) [ClassicSimilarity], result of:
            0.09684435 = score(doc=4854,freq=1.0), product of:
              0.2253306 = queryWeight, product of:
                1.8356719 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017850526 = queryNorm
              0.42978784 = fieldWeight in 4854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=4854)
        0.28 = coord(7/25)
    
  5. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.15
    0.1517653 = sum of:
      0.1517653 = product of:
        0.63235545 = sum of:
          0.011998715 = weight(abstract_txt:wird in 3284) [ClassicSimilarity], result of:
            0.011998715 = score(doc=3284,freq=1.0), product of:
              0.06784007 = queryWeight, product of:
                1.0072287 = boost
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.017850526 = queryNorm
              0.17686766 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.773177 = idf(docFreq=2761, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.14088535 = weight(abstract_txt:klassifizierung in 3284) [ClassicSimilarity], result of:
            0.14088535 = score(doc=3284,freq=5.0), product of:
              0.16266984 = queryWeight, product of:
                1.1028678 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.017850526 = queryNorm
              0.8660816 = fieldWeight in 3284, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.07488104 = weight(abstract_txt:klassifiziert in 3284) [ClassicSimilarity], result of:
            0.07488104 = score(doc=3284,freq=1.0), product of:
              0.18251605 = queryWeight, product of:
                1.1682088 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.017850526 = queryNorm
              0.410271 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.09574316 = weight(abstract_txt:titeldatensätze in 3284) [ClassicSimilarity], result of:
            0.09574316 = score(doc=3284,freq=1.0), product of:
              0.21500982 = queryWeight, product of:
                1.2679412 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.017850526 = queryNorm
              0.44529667 = fieldWeight in 3284, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.14643426 = weight(abstract_txt:titeldatensätzen in 3284) [ClassicSimilarity], result of:
            0.14643426 = score(doc=3284,freq=2.0), product of:
              0.22653654 = queryWeight, product of:
                1.3014848 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.017850526 = queryNorm
              0.6464046 = fieldWeight in 3284, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
          0.16241293 = weight(abstract_txt:automatischen in 3284) [ClassicSimilarity], result of:
            0.16241293 = score(doc=3284,freq=5.0), product of:
              0.2253306 = queryWeight, product of:
                1.8356719 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017850526 = queryNorm
              0.72077614 = fieldWeight in 3284, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.046875 = fieldNorm(doc=3284)
        0.24 = coord(6/25)
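
The score breakdowns above follow the TF-IDF scheme of Lucene's ClassicSimilarity, where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). As a minimal sketch, the arithmetic for the first entry (doc 401) can be reproduced as follows. The function and variable names are illustrative assumptions, not part of the record or of any Lucene API; the numeric inputs are copied from the listing, and small deviations in the last digits are expected because Lucene computes in single precision.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.017850526


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm  # boost * idf * queryNorm
    field_weight = tf * term_idf * field_norm     # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """Sum of the term scores, scaled by coord(matched/total)."""
    raw = sum(term_score(freq, df, boost, field_norm)
              for _term, freq, df, boost in terms)
    return raw * (matched / total)


# (term, termFreq, docFreq, boost) for doc 401, as listed in entry 1.
TERMS_DOC_401 = [
    ("einer", 2.0, 2472, 1.0367322),
    ("durch", 2.0, 1718, 1.1338171),
    ("ergebnisse", 2.0, 500, 1.4629308),
    ("automatischen", 1.0, 123, 1.8356719),
    ("lingo", 1.0, 11, 3.688631),
]

score_401 = document_score(TERMS_DOC_401, field_norm=0.125, matched=5, total=25)
print(round(score_401, 2))  # 0.23, the value shown next to entry 1
```

The rare term "lingo" (docFreq=11) dominates the result: idf enters both queryWeight and fieldWeight, so its contribution grows roughly with the square of the idf value.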