Document (#37124)

Author
Jersek, T.
Title
Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse
Imprint
Köln : Fachhochschule
Year
2012
Pages
V, 56 p.
Abstract
This thesis covers the implementation and execution of automatic DDC classification with the indexing system Lingo. It does so by incorporating relations from the DFG project CrissCross, which Lingo uses to classify bibliographic title records automatically. The approach is compared with the usual methodology of automatic classification systems. The classification procedure is then tested on a test collection of bibliographic title records from the Deutsche Nationalbibliothek (DNB). A discussion of the results and an evaluation of the classification system follow.
Content
Diploma thesis (Diplomarbeit), degree programme in Library Science, Faculty of Information and Communication Sciences, Fachhochschule Köln.
Theme
Automatic classification
Object
Lingo
DDC

Similar documents (content)

  1. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.23
    0.22869475 = sum of:
      0.22869475 = product of:
        1.1434737 = sum of:
          0.050294667 = weight(abstract_txt:einer in 1402) [ClassicSimilarity], result of:
            0.050294667 = score(doc=1402,freq=2.0), product of:
              0.07284914 = queryWeight, product of:
                1.0369797 = boost
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.01798795 = queryNorm
              0.6903948 = fieldWeight in 1402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.125 = fieldNorm(doc=1402)
          0.06581715 = weight(abstract_txt:durch in 1402) [ClassicSimilarity], result of:
            0.06581715 = score(doc=1402,freq=2.0), product of:
              0.08715704 = queryWeight, product of:
                1.1342512 = boost
                4.2718062 = idf(docFreq=1615, maxDocs=42596)
                0.01798795 = queryNorm
              0.7551558 = fieldWeight in 1402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2718062 = idf(docFreq=1615, maxDocs=42596)
                0.125 = fieldNorm(doc=1402)
          0.14132103 = weight(abstract_txt:ergebnisse in 1402) [ClassicSimilarity], result of:
            0.14132103 = score(doc=1402,freq=2.0), product of:
              0.14505999 = queryWeight, product of:
                1.4632949 = boost
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.01798795 = queryNorm
              0.97422475 = fieldWeight in 1402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.125 = fieldNorm(doc=1402)
          0.19445941 = weight(abstract_txt:automatischen in 1402) [ClassicSimilarity], result of:
            0.19445941 = score(doc=1402,freq=1.0), product of:
              0.22610271 = queryWeight, product of:
                1.8268837 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.01798795 = queryNorm
              0.860049 = fieldWeight in 1402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.125 = fieldNorm(doc=1402)
          0.6915815 = weight(abstract_txt:lingo in 1402) [ClassicSimilarity], result of:
            0.6915815 = score(doc=1402,freq=1.0), product of:
              0.6030395 = queryWeight, product of:
                3.6540673 = boost
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.01798795 = queryNorm
              1.1468261 = fieldWeight in 1402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.125 = fieldNorm(doc=1402)
        0.2 = coord(5/25)
    
  2. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.21
    0.20879447 = sum of:
      0.20879447 = product of:
        1.3049655 = sum of:
          0.03466521 = weight(abstract_txt:wird in 4582) [ClassicSimilarity], result of:
            0.03466521 = score(doc=4582,freq=2.0), product of:
              0.06885965 = queryWeight, product of:
                1.0081855 = boost
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.01798795 = queryNorm
              0.50341827 = fieldWeight in 4582, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.09375 = fieldNorm(doc=4582)
          0.026672777 = weight(abstract_txt:einer in 4582) [ClassicSimilarity], result of:
            0.026672777 = score(doc=4582,freq=1.0), product of:
              0.07284914 = queryWeight, product of:
                1.0369797 = boost
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.01798795 = queryNorm
              0.36613715 = fieldWeight in 4582, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.09375 = fieldNorm(doc=4582)
          0.20625536 = weight(abstract_txt:automatischen in 4582) [ClassicSimilarity], result of:
            0.20625536 = score(doc=4582,freq=2.0), product of:
              0.22610271 = queryWeight, product of:
                1.8268837 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.01798795 = queryNorm
              0.91221976 = fieldWeight in 4582, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.09375 = fieldNorm(doc=4582)
          1.0373721 = weight(abstract_txt:lingo in 4582) [ClassicSimilarity], result of:
            1.0373721 = score(doc=4582,freq=4.0), product of:
              0.6030395 = queryWeight, product of:
                3.6540673 = boost
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.01798795 = queryNorm
              1.7202392 = fieldWeight in 4582, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.09375 = fieldNorm(doc=4582)
        0.16 = coord(4/25)
    
  3. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.19
    0.19156054 = sum of:
      0.19156054 = product of:
        0.6841448 = sum of:
          0.022837711 = weight(abstract_txt:wird in 2055) [ClassicSimilarity], result of:
            0.022837711 = score(doc=2055,freq=5.0), product of:
              0.06885965 = queryWeight, product of:
                1.0081855 = boost
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.01798795 = queryNorm
              0.33165592 = fieldWeight in 2055, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.024850892 = weight(abstract_txt:einer in 2055) [ClassicSimilarity], result of:
            0.024850892 = score(doc=2055,freq=5.0), product of:
              0.07284914 = queryWeight, product of:
                1.0369797 = boost
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.01798795 = queryNorm
              0.34112814 = fieldWeight in 2055, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.905463 = idf(docFreq=2330, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.020567859 = weight(abstract_txt:durch in 2055) [ClassicSimilarity], result of:
            0.020567859 = score(doc=2055,freq=2.0), product of:
              0.08715704 = queryWeight, product of:
                1.1342512 = boost
                4.2718062 = idf(docFreq=1615, maxDocs=42596)
                0.01798795 = queryNorm
              0.23598619 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2718062 = idf(docFreq=1615, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.07903115 = weight(abstract_txt:indexierungssystem in 2055) [ClassicSimilarity], result of:
            0.07903115 = score(doc=2055,freq=1.0), product of:
              0.21381687 = queryWeight, product of:
                1.2562151 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.01798795 = queryNorm
              0.36962074 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.04416282 = weight(abstract_txt:ergebnisse in 2055) [ClassicSimilarity], result of:
            0.04416282 = score(doc=2055,freq=2.0), product of:
              0.14505999 = queryWeight, product of:
                1.4632949 = boost
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.01798795 = queryNorm
              0.30444524 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.060455926 = weight(abstract_txt:anhand in 2055) [ClassicSimilarity], result of:
            0.060455926 = score(doc=2055,freq=3.0), product of:
              0.1562327 = queryWeight, product of:
                1.518602 = boost
                5.7193446 = idf(docFreq=379, maxDocs=42596)
                0.01798795 = queryNorm
              0.38696077 = fieldWeight in 2055, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7193446 = idf(docFreq=379, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
          0.43223843 = weight(abstract_txt:lingo in 2055) [ClassicSimilarity], result of:
            0.43223843 = score(doc=2055,freq=4.0), product of:
              0.6030395 = queryWeight, product of:
                3.6540673 = boost
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.01798795 = queryNorm
              0.71676636 = fieldWeight in 2055, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.174609 = idf(docFreq=11, maxDocs=42596)
                0.0390625 = fieldNorm(doc=2055)
        0.28 = coord(7/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15438673 = sum of:
      0.15438673 = product of:
        0.5513812 = sum of:
          0.040027935 = weight(abstract_txt:wird in 5855) [ClassicSimilarity], result of:
            0.040027935 = score(doc=5855,freq=6.0), product of:
              0.06885965 = queryWeight, product of:
                1.0081855 = boost
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.01798795 = queryNorm
              0.5812974 = fieldWeight in 5855, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.08306554 = weight(abstract_txt:klassifizierung in 5855) [ClassicSimilarity], result of:
            0.08306554 = score(doc=5855,freq=1.0), product of:
              0.16157608 = queryWeight, product of:
                1.0920224 = boost
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.01798795 = queryNorm
              0.51409554 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.09879684 = weight(abstract_txt:klassifiziert in 5855) [ClassicSimilarity], result of:
            0.09879684 = score(doc=5855,freq=1.0), product of:
              0.18138102 = queryWeight, product of:
                1.1570148 = boost
                8.715076 = idf(docFreq=18, maxDocs=42596)
                0.01798795 = queryNorm
              0.5446923 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.715076 = idf(docFreq=18, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.12644984 = weight(abstract_txt:titeldatensätze in 5855) [ClassicSimilarity], result of:
            0.12644984 = score(doc=5855,freq=1.0), product of:
              0.21381687 = queryWeight, product of:
                1.2562151 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.01798795 = queryNorm
              0.5913932 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.04996453 = weight(abstract_txt:ergebnisse in 5855) [ClassicSimilarity], result of:
            0.04996453 = score(doc=5855,freq=1.0), product of:
              0.14505999 = queryWeight, product of:
                1.4632949 = boost
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.01798795 = queryNorm
              0.34444046 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5110474 = idf(docFreq=467, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.05584679 = weight(abstract_txt:anhand in 5855) [ClassicSimilarity], result of:
            0.05584679 = score(doc=5855,freq=1.0), product of:
              0.1562327 = queryWeight, product of:
                1.518602 = boost
                5.7193446 = idf(docFreq=379, maxDocs=42596)
                0.01798795 = queryNorm
              0.35745904 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7193446 = idf(docFreq=379, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
          0.097229704 = weight(abstract_txt:automatischen in 5855) [ClassicSimilarity], result of:
            0.097229704 = score(doc=5855,freq=1.0), product of:
              0.22610271 = queryWeight, product of:
                1.8268837 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.01798795 = queryNorm
              0.4300245 = fieldWeight in 5855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.0625 = fieldNorm(doc=5855)
        0.28 = coord(7/25)
    
  5. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.15
    0.15255992 = sum of:
      0.15255992 = product of:
        0.6356664 = sum of:
          0.012256002 = weight(abstract_txt:wird in 4464) [ClassicSimilarity], result of:
            0.012256002 = score(doc=4464,freq=1.0), product of:
              0.06885965 = queryWeight, product of:
                1.0081855 = boost
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.01798795 = queryNorm
              0.17798525 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7970185 = idf(docFreq=2597, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.13930514 = weight(abstract_txt:klassifizierung in 4464) [ClassicSimilarity], result of:
            0.13930514 = score(doc=4464,freq=5.0), product of:
              0.16157608 = queryWeight, product of:
                1.0920224 = boost
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.01798795 = queryNorm
              0.86216444 = fieldWeight in 4464, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.074097626 = weight(abstract_txt:klassifiziert in 4464) [ClassicSimilarity], result of:
            0.074097626 = score(doc=4464,freq=1.0), product of:
              0.18138102 = queryWeight, product of:
                1.1570148 = boost
                8.715076 = idf(docFreq=18, maxDocs=42596)
                0.01798795 = queryNorm
              0.4085192 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.715076 = idf(docFreq=18, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.094837375 = weight(abstract_txt:titeldatensätze in 4464) [ClassicSimilarity], result of:
            0.094837375 = score(doc=4464,freq=1.0), product of:
              0.21381687 = queryWeight, product of:
                1.2562151 = boost
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.01798795 = queryNorm
              0.44354486 = fieldWeight in 4464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.462291 = idf(docFreq=8, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.15211107 = weight(abstract_txt:titeldatensätzen in 4464) [ClassicSimilarity], result of:
            0.15211107 = score(doc=4464,freq=2.0), product of:
              0.23253383 = queryWeight, product of:
                1.3100446 = boost
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.01798795 = queryNorm
              0.65414596 = fieldWeight in 4464, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.867756 = idf(docFreq=5, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
          0.16305918 = weight(abstract_txt:automatischen in 4464) [ClassicSimilarity], result of:
            0.16305918 = score(doc=4464,freq=5.0), product of:
              0.22610271 = queryWeight, product of:
                1.8268837 = boost
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.01798795 = queryNorm
              0.72117305 = fieldWeight in 4464, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.880392 = idf(docFreq=118, maxDocs=42596)
                0.046875 = fieldNorm(doc=4464)
        0.24 = coord(6/25)
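
Note on the score breakdowns: the numbers listed under each similar document appear to follow the classic Lucene TF-IDF formula (ClassicSimilarity), where tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the summed term scores are multiplied by the coord factor. The following is a minimal sketch that recomputes the score of the first entry (doc 1402) from the values printed above. The function and variable names are illustrative assumptions, not part of the retrieval system; boost, queryNorm and fieldNorm are copied from the listing rather than derived, and small deviations from the printed figures are expected because Lucene computes in 32-bit floats.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain output above
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the listing for doc 1402 (not derived here)
MAX_DOCS = 42596
QUERY_NORM = 0.01798795
FIELD_NORM = 0.125

# (term, termFreq, docFreq, boost)
matched_terms = [
    ("einer", 2.0, 2330, 1.0369797),
    ("durch", 2.0, 1615, 1.1342512),
    ("ergebnisse", 2.0, 467, 1.4632949),
    ("automatischen", 1.0, 118, 1.8268837),
    ("lingo", 1.0, 11, 3.6540673),
]

scores = {
    term: term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for term, freq, doc_freq, boost in matched_terms
}

# coord(5/25): 5 of the 25 query terms matched this document
coord = len(matched_terms) / 25
final_score = sum(scores.values()) * coord

for term, value in scores.items():
    print(f"{term:15s} {value:.6f}")
print(f"final score     {final_score:.6f}")
```

The rare term "lingo" (docFreq=11) dominates the sum, which is consistent with why all of the top-ranked entries are Lingo-related works even when only a few of the 25 query terms match.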