Document (#37124)

Author
Jersek, T.
Title
Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse
Imprint
Köln : Fachhochschule
Year
2012
Pages
V, 56 pp.
Abstract
This thesis deals with the implementation and execution of automatic DDC classification using the indexing system Lingo. This is done by incorporating relations from the DFG project CrissCross, on the basis of which Lingo automatically classifies bibliographic title records. The approach used is compared with the usual methodological procedure of automatic classification systems. The classification method is then tested on a test collection of bibliographic title records from the Deutsche Nationalbibliothek (DNB). A discussion of the results and an evaluation of the classification system follow.
Content
Diploma thesis (Diplomarbeit), Library Science degree programme, Faculty of Information and Communication Sciences, Fachhochschule Köln.
Theme
Automatic classification (Automatisches Klassifizieren)
Object
Lingo
DDC

Similar documents (content)

  1. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.23
    0.22853027 = sum of:
      0.22853027 = product of:
        1.1426513 = sum of:
          0.05006157 = weight(abstract_txt:einer in 2402) [ClassicSimilarity], result of:
            0.05006157 = score(doc=2402,freq=2.0), product of:
              0.07261551 = queryWeight, product of:
                1.0372329 = boost
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.017951597 = queryNorm
              0.689406 = fieldWeight in 2402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.125 = fieldNorm(doc=2402)
          0.06563702 = weight(abstract_txt:durch in 2402) [ClassicSimilarity], result of:
            0.06563702 = score(doc=2402,freq=2.0), product of:
              0.086987935 = queryWeight, product of:
                1.1352489 = boost
                4.2683973 = idf(docFreq=1626, maxDocs=42740)
                0.017951597 = queryNorm
              0.75455314 = fieldWeight in 2402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2683973 = idf(docFreq=1626, maxDocs=42740)
                0.125 = fieldNorm(doc=2402)
          0.1408776 = weight(abstract_txt:ergebnisse in 2402) [ClassicSimilarity], result of:
            0.1408776 = score(doc=2402,freq=2.0), product of:
              0.14473973 = queryWeight, product of:
                1.4643856 = boost
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.017951597 = queryNorm
              0.9733168 = fieldWeight in 2402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.125 = fieldNorm(doc=2402)
          0.1939693 = weight(abstract_txt:automatischen in 2402) [ClassicSimilarity], result of:
            0.1939693 = score(doc=2402,freq=1.0), product of:
              0.22569664 = queryWeight, product of:
                1.8286228 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.017951597 = queryNorm
              0.8594248 = fieldWeight in 2402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.125 = fieldNorm(doc=2402)
          0.6921058 = weight(abstract_txt:lingo in 2402) [ClassicSimilarity], result of:
            0.6921058 = score(doc=2402,freq=1.0), product of:
              0.6032748 = queryWeight, product of:
                3.6615489 = boost
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.017951597 = queryNorm
              1.147248 = fieldWeight in 2402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.125 = fieldNorm(doc=2402)
        0.2 = coord(5/25)
    
  2. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.21
    0.20879501 = sum of:
      0.20879501 = product of:
        1.3049688 = sum of:
          0.034525372 = weight(abstract_txt:wird in 4582) [ClassicSimilarity], result of:
            0.034525372 = score(doc=4582,freq=2.0), product of:
              0.068666436 = queryWeight, product of:
                1.0086346 = boost
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.017951597 = queryNorm
              0.5027984 = fieldWeight in 4582, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.09375 = fieldNorm(doc=4582)
          0.026549157 = weight(abstract_txt:einer in 4582) [ClassicSimilarity], result of:
            0.026549157 = score(doc=4582,freq=1.0), product of:
              0.07261551 = queryWeight, product of:
                1.0372329 = boost
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.017951597 = queryNorm
              0.36561275 = fieldWeight in 4582, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.09375 = fieldNorm(doc=4582)
          0.2057355 = weight(abstract_txt:automatischen in 4582) [ClassicSimilarity], result of:
            0.2057355 = score(doc=4582,freq=2.0), product of:
              0.22569664 = queryWeight, product of:
                1.8286228 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.017951597 = queryNorm
              0.9115577 = fieldWeight in 4582, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.09375 = fieldNorm(doc=4582)
          1.0381588 = weight(abstract_txt:lingo in 4582) [ClassicSimilarity], result of:
            1.0381588 = score(doc=4582,freq=4.0), product of:
              0.6032748 = queryWeight, product of:
                3.6615489 = boost
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.017951597 = queryNorm
              1.720872 = fieldWeight in 4582, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.09375 = fieldNorm(doc=4582)
        0.16 = coord(4/25)
    
  3. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.19
    0.19153325 = sum of:
      0.19153325 = product of:
        0.68404734 = sum of:
          0.022745583 = weight(abstract_txt:wird in 3055) [ClassicSimilarity], result of:
            0.022745583 = score(doc=3055,freq=5.0), product of:
              0.068666436 = queryWeight, product of:
                1.0086346 = boost
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.017951597 = queryNorm
              0.33124748 = fieldWeight in 3055, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.024735719 = weight(abstract_txt:einer in 3055) [ClassicSimilarity], result of:
            0.024735719 = score(doc=3055,freq=5.0), product of:
              0.07261551 = queryWeight, product of:
                1.0372329 = boost
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.017951597 = queryNorm
              0.3406396 = fieldWeight in 3055, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.8998692 = idf(docFreq=2351, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.020511568 = weight(abstract_txt:durch in 3055) [ClassicSimilarity], result of:
            0.020511568 = score(doc=3055,freq=2.0), product of:
              0.086987935 = queryWeight, product of:
                1.1352489 = boost
                4.2683973 = idf(docFreq=1626, maxDocs=42740)
                0.017951597 = queryNorm
              0.23579785 = fieldWeight in 3055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2683973 = idf(docFreq=1626, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.07908841 = weight(abstract_txt:indexierungssystem in 3055) [ClassicSimilarity], result of:
            0.07908841 = score(doc=3055,freq=1.0), product of:
              0.2138955 = queryWeight, product of:
                1.258773 = boost
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.017951597 = queryNorm
              0.3697526 = fieldWeight in 3055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.04402425 = weight(abstract_txt:ergebnisse in 3055) [ClassicSimilarity], result of:
            0.04402425 = score(doc=3055,freq=2.0), product of:
              0.14473973 = queryWeight, product of:
                1.4643856 = boost
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.017951597 = queryNorm
              0.3041615 = fieldWeight in 3055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.060375616 = weight(abstract_txt:anhand in 3055) [ClassicSimilarity], result of:
            0.060375616 = score(doc=3055,freq=3.0), product of:
              0.15607634 = queryWeight, product of:
                1.520653 = boost
                5.7174697 = idf(docFreq=381, maxDocs=42740)
                0.017951597 = queryNorm
              0.38683388 = fieldWeight in 3055, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7174697 = idf(docFreq=381, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
          0.43256617 = weight(abstract_txt:lingo in 3055) [ClassicSimilarity], result of:
            0.43256617 = score(doc=3055,freq=4.0), product of:
              0.6032748 = queryWeight, product of:
                3.6615489 = boost
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.017951597 = queryNorm
              0.71703005 = fieldWeight in 3055, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.0390625 = fieldNorm(doc=3055)
        0.28 = coord(7/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15427706 = sum of:
      0.15427706 = product of:
        0.5509895 = sum of:
          0.039866466 = weight(abstract_txt:wird in 855) [ClassicSimilarity], result of:
            0.039866466 = score(doc=855,freq=6.0), product of:
              0.068666436 = queryWeight, product of:
                1.0086346 = boost
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.017951597 = queryNorm
              0.58058155 = fieldWeight in 855, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.0831391 = weight(abstract_txt:klassifizierung in 855) [ClassicSimilarity], result of:
            0.0831391 = score(doc=855,freq=1.0), product of:
              0.16165283 = queryWeight, product of:
                1.0943047 = boost
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.017951597 = queryNorm
              0.5143065 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.09887749 = weight(abstract_txt:klassifiziert in 855) [ClassicSimilarity], result of:
            0.09887749 = score(doc=855,freq=1.0), product of:
              0.18145882 = queryWeight, product of:
                1.1594062 = boost
                8.7184515 = idf(docFreq=18, maxDocs=42740)
                0.017951597 = queryNorm
              0.5449032 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.7184515 = idf(docFreq=18, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.12654145 = weight(abstract_txt:titeldatensätze in 855) [ClassicSimilarity], result of:
            0.12654145 = score(doc=855,freq=1.0), product of:
              0.2138955 = queryWeight, product of:
                1.258773 = boost
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.017951597 = queryNorm
              0.5916041 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.049807757 = weight(abstract_txt:ergebnisse in 855) [ClassicSimilarity], result of:
            0.049807757 = score(doc=855,freq=1.0), product of:
              0.14473973 = queryWeight, product of:
                1.4643856 = boost
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.017951597 = queryNorm
              0.34411946 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5059114 = idf(docFreq=471, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.05577261 = weight(abstract_txt:anhand in 855) [ClassicSimilarity], result of:
            0.05577261 = score(doc=855,freq=1.0), product of:
              0.15607634 = queryWeight, product of:
                1.520653 = boost
                5.7174697 = idf(docFreq=381, maxDocs=42740)
                0.017951597 = queryNorm
              0.35734186 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7174697 = idf(docFreq=381, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
          0.09698465 = weight(abstract_txt:automatischen in 855) [ClassicSimilarity], result of:
            0.09698465 = score(doc=855,freq=1.0), product of:
              0.22569664 = queryWeight, product of:
                1.8286228 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.017951597 = queryNorm
              0.4297124 = fieldWeight in 855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.0625 = fieldNorm(doc=855)
        0.28 = coord(7/25)
    
  5. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.15
    0.15253489 = sum of:
      0.15253489 = product of:
        0.63556206 = sum of:
          0.012206561 = weight(abstract_txt:wird in 285) [ClassicSimilarity], result of:
            0.012206561 = score(doc=285,freq=1.0), product of:
              0.068666436 = queryWeight, product of:
                1.0086346 = boost
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.017951597 = queryNorm
              0.17776605 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7923427 = idf(docFreq=2618, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
          0.13942851 = weight(abstract_txt:klassifizierung in 285) [ClassicSimilarity], result of:
            0.13942851 = score(doc=285,freq=5.0), product of:
              0.16165283 = queryWeight, product of:
                1.0943047 = boost
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.017951597 = queryNorm
              0.8625182 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
          0.07415812 = weight(abstract_txt:klassifiziert in 285) [ClassicSimilarity], result of:
            0.07415812 = score(doc=285,freq=1.0), product of:
              0.18145882 = queryWeight, product of:
                1.1594062 = boost
                8.7184515 = idf(docFreq=18, maxDocs=42740)
                0.017951597 = queryNorm
              0.4086774 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.7184515 = idf(docFreq=18, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
          0.09490609 = weight(abstract_txt:titeldatensätze in 285) [ClassicSimilarity], result of:
            0.09490609 = score(doc=285,freq=1.0), product of:
              0.2138955 = queryWeight, product of:
                1.258773 = boost
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.017951597 = queryNorm
              0.4437031 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.465666 = idf(docFreq=8, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
          0.15221462 = weight(abstract_txt:titeldatensätzen in 285) [ClassicSimilarity], result of:
            0.15221462 = score(doc=285,freq=2.0), product of:
              0.23261257 = queryWeight, product of:
                1.312693 = boost
                9.871131 = idf(docFreq=5, maxDocs=42740)
                0.017951597 = queryNorm
              0.6543697 = fieldWeight in 285, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.871131 = idf(docFreq=5, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
          0.1626482 = weight(abstract_txt:automatischen in 285) [ClassicSimilarity], result of:
            0.1626482 = score(doc=285,freq=5.0), product of:
              0.22569664 = queryWeight, product of:
                1.8286228 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.017951597 = queryNorm
              0.72064966 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.046875 = fieldNorm(doc=285)
        0.24 = coord(6/25)
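
The score trees above follow the TF-IDF scheme of Lucene's ClassicSimilarity: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched query terms / total query terms). The following is a minimal Python sketch of that arithmetic, not the catalog's actual implementation. The function names are hypothetical, and boost, queryNorm, and fieldNorm are simply copied from the listing rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched_terms, total_terms):
    """Sum of the term scores, scaled by coord = matched / total."""
    return sum(term_scores) * (matched_terms / total_terms)


# Values copied from the explain tree of document 2402 (similar document 1).
MAX_DOCS = 42740
QUERY_NORM = 0.017951597
FIELD_NORM_2402 = 0.125

# (term, freq, docFreq, boost)
TERMS_2402 = [
    ("einer", 2.0, 2351, 1.0372329),
    ("durch", 2.0, 1626, 1.1352489),
    ("ergebnisse", 2.0, 471, 1.4643856),
    ("automatischen", 1.0, 119, 1.8286228),
    ("lingo", 1.0, 11, 3.6615489),
]

scores_2402 = [
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM_2402)
    for _, freq, doc_freq, boost in TERMS_2402
]
total_2402 = document_score(scores_2402, matched_terms=5, total_terms=25)

if __name__ == "__main__":
    for (term, *_), score in zip(TERMS_2402, scores_2402):
        print(f"{term}: {score:.7f}")
    print(f"document 2402: {total_2402:.7f}")
```

Run against the figures for document 2402, the sketch should agree with the listed term weights and the listed total of 0.22853027 up to floating-point rounding, since Lucene computes these values in single precision.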