Document (#36874)

Author
Bredack, J.
Lepsky, K.
Title
Automatische Extraktion von Fachterminologie aus Volltexten
Source
ABI-Technik. 34(2014) H.1, S.2-8
Year
2014
Abstract
Subject-specific terminology in scholarly texts often takes the form of phrases or multi-word groups. The paper presents an algorithmic method for identifying and extracting terminological multi-word groups. A particular focus is the inclusion of German function words, which makes it possible to extract complex multi-word constructions. The automatic indexing system Lingo was used. The results of extracting art-historical terminology from the Reallexikon zur Deutschen Kunstgeschichte demonstrate the suitability of the method.
Theme
Automatisches Indexieren
Field
Kunst
Object
Lingo
RDK

Similar documents (author)

  1. Lepsky, K.: Art and language : Ernst H. Gombrich and Karl Bühler's theory of language (1996) 5.06
    5.0570784 = sum of:
      5.0570784 = weight(author_txt:lepsky in 5229) [ClassicSimilarity], result of:
        5.0570784 = fieldWeight in 5229, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.091326 = idf(docFreq=35, maxDocs=43254)
          0.625 = fieldNorm(doc=5229)
    
  2. Lepsky, K.: Maschinelle Indexierung von Titelaufnahmen zur Verbesserung der sachlichen Erschließung in Online-Publikumskatalogen (1994) 5.06
    5.0570784 = sum of:
      5.0570784 = weight(author_txt:lepsky in 64) [ClassicSimilarity], result of:
        5.0570784 = fieldWeight in 64, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.091326 = idf(docFreq=35, maxDocs=43254)
          0.625 = fieldNorm(doc=64)
    
  3. Lepsky, K.: RSWK - und was noch? : Stellungnahme zum Bericht 'Sacherschließung in Online-Katalogen' der Expertengruppe Online-Kataloge (1995) 5.06
    5.0570784 = sum of:
      5.0570784 = weight(author_txt:lepsky in 1841) [ClassicSimilarity], result of:
        5.0570784 = fieldWeight in 1841, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.091326 = idf(docFreq=35, maxDocs=43254)
          0.625 = fieldNorm(doc=1841)
    
  4. Lepsky, K.: Bild und Wirklichkeit : die Wirklichkeit im Bild (1987) 5.06
    5.0570784 = sum of:
      5.0570784 = weight(author_txt:lepsky in 2415) [ClassicSimilarity], result of:
        5.0570784 = fieldWeight in 2415, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.091326 = idf(docFreq=35, maxDocs=43254)
          0.625 = fieldNorm(doc=2415)
    
  5. Lepsky, K.: Ernst H. Gombrich : Theorie und Methode (1991) 5.06
    5.0570784 = sum of:
      5.0570784 = weight(author_txt:lepsky in 2754) [ClassicSimilarity], result of:
        5.0570784 = fieldWeight in 2754, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.091326 = idf(docFreq=35, maxDocs=43254)
          0.625 = fieldNorm(doc=2754)
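
The author-match scores above all follow the same arithmetic: tf × idf × fieldNorm. The sketch below recomputes one of them, assuming the usual ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any Lucene API, and the result differs from the listing in the last digits because the listing appears to use single-precision arithmetic.

```python
import math

def classic_tf(freq: float) -> float:
    # Assumed ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain output.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the listing for author_txt:lepsky.
idf_lepsky = classic_idf(doc_freq=35, max_docs=43254)
w = field_weight(freq=1.0, doc_freq=35, max_docs=43254, field_norm=0.625)
print(f"idf = {idf_lepsky:.6f}, fieldWeight = {w:.6f}")
```

Because every author match has freq=1.0, the same docFreq, and the same fieldNorm, all five documents receive an identical score of about 5.06.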
    

Similar documents (content)

  1. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.61
    0.60832596 = sum of:
      0.60832596 = product of:
        1.382559 = sum of:
          0.028008265 = weight(abstract_txt:verfahren in 2519) [ClassicSimilarity], result of:
            0.028008265 = score(doc=2519,freq=4.0), product of:
              0.06205001 = queryWeight, product of:
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.010739585 = queryNorm
              0.45138213 = fieldWeight in 2519, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.027421743 = weight(abstract_txt:sprache in 2519) [ClassicSimilarity], result of:
            0.027421743 = score(doc=2519,freq=3.0), product of:
              0.06733807 = queryWeight, product of:
                1.0417402 = boost
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.010739585 = queryNorm
              0.40722495 = fieldWeight in 2519, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.02787793 = weight(abstract_txt:texten in 2519) [ClassicSimilarity], result of:
            0.02787793 = score(doc=2519,freq=1.0), product of:
              0.09819244 = queryWeight, product of:
                1.2579637 = boost
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.010739585 = queryNorm
              0.28391117 = fieldWeight in 2519, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.054075375 = weight(abstract_txt:einbindung in 2519) [ClassicSimilarity], result of:
            0.054075375 = score(doc=2519,freq=3.0), product of:
              0.10589212 = queryWeight, product of:
                1.306354 = boost
                7.5477104 = idf(docFreq=61, maxDocs=43254)
                0.010739585 = queryNorm
              0.51066476 = fieldWeight in 2519, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5477104 = idf(docFreq=61, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.03928472 = weight(abstract_txt:verfahrens in 2519) [ClassicSimilarity], result of:
            0.03928472 = score(doc=2519,freq=1.0), product of:
              0.123420365 = queryWeight, product of:
                1.4103357 = boost
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.010739585 = queryNorm
              0.31830016 = fieldWeight in 2519, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.112709634 = weight(abstract_txt:lingo in 2519) [ClassicSimilarity], result of:
            0.112709634 = score(doc=2519,freq=4.0), product of:
              0.15698509 = queryWeight, product of:
                1.5905901 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.010739585 = queryNorm
              0.71796393 = fieldWeight in 2519, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.061814606 = weight(abstract_txt:indexierungssystem in 2519) [ClassicSimilarity], result of:
            0.061814606 = score(doc=2519,freq=1.0), product of:
              0.16696744 = queryWeight, product of:
                1.6403818 = boost
                9.47762 = idf(docFreq=8, maxDocs=43254)
                0.010739585 = queryNorm
              0.37021953 = fieldWeight in 2519, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.47762 = idf(docFreq=8, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.027977074 = weight(abstract_txt:deutschen in 2519) [ClassicSimilarity], result of:
            0.027977074 = score(doc=2519,freq=2.0), product of:
              0.098425105 = queryWeight, product of:
                1.7811357 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.010739585 = queryNorm
              0.28424734 = fieldWeight in 2519, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.5054434 = weight(abstract_txt:mehrwortgruppen in 2519) [ClassicSimilarity], result of:
            0.5054434 = score(doc=2519,freq=13.0), product of:
              0.3631184 = queryWeight, product of:
                3.4211192 = boost
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.010739585 = queryNorm
              1.391952 = fieldWeight in 2519, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.16906445 = weight(abstract_txt:fachterminologie in 2519) [ClassicSimilarity], result of:
            0.16906445 = score(doc=2519,freq=1.0), product of:
              0.47095525 = queryWeight, product of:
                4.7717705 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.010739585 = queryNorm
              0.35898197 = fieldWeight in 2519, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
          0.3288817 = weight(abstract_txt:extraktion in 2519) [ClassicSimilarity], result of:
            0.3288817 = score(doc=2519,freq=3.0), product of:
              0.5600719 = queryWeight, product of:
                6.008706 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.010739585 = queryNorm
              0.5872134 = fieldWeight in 2519, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0390625 = fieldNorm(doc=2519)
        0.44 = coord(11/25)
    
  2. Grün, S.: Bildung von Komposita-Indextermen auf der Basis einer algorithmischen Mehrwortgruppenanalyse mit Lingo (2015) 0.22
    0.22488761 = sum of:
      0.22488761 = product of:
        1.124438 = sum of:
          0.031663902 = weight(abstract_txt:sprache in 3336) [ClassicSimilarity], result of:
            0.031663902 = score(doc=3336,freq=1.0), product of:
              0.06733807 = queryWeight, product of:
                1.0417402 = boost
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.010739585 = queryNorm
              0.4702229 = fieldWeight in 3336, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.078125 = fieldNorm(doc=3336)
          0.112709634 = weight(abstract_txt:lingo in 3336) [ClassicSimilarity], result of:
            0.112709634 = score(doc=3336,freq=1.0), product of:
              0.15698509 = queryWeight, product of:
                1.5905901 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.010739585 = queryNorm
              0.71796393 = fieldWeight in 3336, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.078125 = fieldNorm(doc=3336)
          0.039565556 = weight(abstract_txt:deutschen in 3336) [ClassicSimilarity], result of:
            0.039565556 = score(doc=3336,freq=1.0), product of:
              0.098425105 = queryWeight, product of:
                1.7811357 = boost
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.010739585 = queryNorm
              0.40198642 = fieldWeight in 3336, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1454263 = idf(docFreq=684, maxDocs=43254)
                0.078125 = fieldNorm(doc=3336)
          0.5607391 = weight(abstract_txt:mehrwortgruppen in 3336) [ClassicSimilarity], result of:
            0.5607391 = score(doc=3336,freq=4.0), product of:
              0.3631184 = queryWeight, product of:
                3.4211192 = boost
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.010739585 = queryNorm
              1.5442321 = fieldWeight in 3336, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.078125 = fieldNorm(doc=3336)
          0.3797599 = weight(abstract_txt:extraktion in 3336) [ClassicSimilarity], result of:
            0.3797599 = score(doc=3336,freq=1.0), product of:
              0.5600719 = queryWeight, product of:
                6.008706 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.010739585 = queryNorm
              0.67805564 = fieldWeight in 3336, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.078125 = fieldNorm(doc=3336)
        0.2 = coord(5/25)
    
  3. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.19
    0.1923772 = sum of:
      0.1923772 = product of:
        1.2023575 = sum of:
          0.08920937 = weight(abstract_txt:texten in 1866) [ClassicSimilarity], result of:
            0.08920937 = score(doc=1866,freq=1.0), product of:
              0.09819244 = queryWeight, product of:
                1.2579637 = boost
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.010739585 = queryNorm
              0.9085157 = fieldWeight in 1866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.125 = fieldNorm(doc=1866)
          0.18033542 = weight(abstract_txt:lingo in 1866) [ClassicSimilarity], result of:
            0.18033542 = score(doc=1866,freq=1.0), product of:
              0.15698509 = queryWeight, product of:
                1.5905901 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.010739585 = queryNorm
              1.1487423 = fieldWeight in 1866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.125 = fieldNorm(doc=1866)
          0.15582992 = weight(abstract_txt:automatische in 1866) [ClassicSimilarity], result of:
            0.15582992 = score(doc=1866,freq=1.0), product of:
              0.17943822 = queryWeight, product of:
                2.404925 = boost
                6.9474573 = idf(docFreq=112, maxDocs=43254)
                0.010739585 = queryNorm
              0.86843216 = fieldWeight in 1866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9474573 = idf(docFreq=112, maxDocs=43254)
                0.125 = fieldNorm(doc=1866)
          0.7769829 = weight(abstract_txt:mehrwortgruppen in 1866) [ClassicSimilarity], result of:
            0.7769829 = score(doc=1866,freq=3.0), product of:
              0.3631184 = queryWeight, product of:
                3.4211192 = boost
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.010739585 = queryNorm
              2.1397507 = fieldWeight in 1866, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.125 = fieldNorm(doc=1866)
        0.16 = coord(4/25)
    
  4. Witschel, H.F.: Text, Wörter, Morpheme : Möglichkeiten einer automatischen Terminologie-Extraktion (2004) 0.17
    0.1654297 = sum of:
      0.1654297 = product of:
        0.82714844 = sum of:
          0.038809393 = weight(abstract_txt:verfahren in 1591) [ClassicSimilarity], result of:
            0.038809393 = score(doc=1591,freq=3.0), product of:
              0.06205001 = queryWeight, product of:
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.010739585 = queryNorm
              0.6254534 = fieldWeight in 1591, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.0625 = fieldNorm(doc=1591)
          0.025331123 = weight(abstract_txt:sprache in 1591) [ClassicSimilarity], result of:
            0.025331123 = score(doc=1591,freq=1.0), product of:
              0.06733807 = queryWeight, product of:
                1.0417402 = boost
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.010739585 = queryNorm
              0.37617832 = fieldWeight in 1591, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.018853 = idf(docFreq=285, maxDocs=43254)
                0.0625 = fieldNorm(doc=1591)
          0.06285556 = weight(abstract_txt:verfahrens in 1591) [ClassicSimilarity], result of:
            0.06285556 = score(doc=1591,freq=1.0), product of:
              0.123420365 = queryWeight, product of:
                1.4103357 = boost
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.010739585 = queryNorm
              0.50928026 = fieldWeight in 1591, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.148484 = idf(docFreq=33, maxDocs=43254)
                0.0625 = fieldNorm(doc=1591)
          0.2705031 = weight(abstract_txt:fachterminologie in 1591) [ClassicSimilarity], result of:
            0.2705031 = score(doc=1591,freq=1.0), product of:
              0.47095525 = queryWeight, product of:
                4.7717705 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.010739585 = queryNorm
              0.57437116 = fieldWeight in 1591, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0625 = fieldNorm(doc=1591)
          0.4296493 = weight(abstract_txt:extraktion in 1591) [ClassicSimilarity], result of:
            0.4296493 = score(doc=1591,freq=2.0), product of:
              0.5600719 = queryWeight, product of:
                6.008706 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.010739585 = queryNorm
              0.7671324 = fieldWeight in 1591, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0625 = fieldNorm(doc=1591)
        0.2 = coord(5/25)
    
  5. Witschel, H.F.: Terminologie-Extraktion : Möglichkeiten der Kombination statistischer und musterbasierter Verfahren (2004) 0.14
    0.14130613 = sum of:
      0.14130613 = product of:
        0.70653063 = sum of:
          0.022406613 = weight(abstract_txt:verfahren in 1588) [ClassicSimilarity], result of:
            0.022406613 = score(doc=1588,freq=1.0), product of:
              0.06205001 = queryWeight, product of:
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.010739585 = queryNorm
              0.3611057 = fieldWeight in 1588, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7776914 = idf(docFreq=363, maxDocs=43254)
                0.0625 = fieldNorm(doc=1588)
          0.03539358 = weight(abstract_txt:liegt in 1588) [ClassicSimilarity], result of:
            0.03539358 = score(doc=1588,freq=2.0), product of:
              0.06679809 = queryWeight, product of:
                1.037555 = boost
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.010739585 = queryNorm
              0.5298592 = fieldWeight in 1588, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.0625 = fieldNorm(doc=1588)
          0.044604685 = weight(abstract_txt:texten in 1588) [ClassicSimilarity], result of:
            0.044604685 = score(doc=1588,freq=1.0), product of:
              0.09819244 = queryWeight, product of:
                1.2579637 = boost
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.010739585 = queryNorm
              0.45425785 = fieldWeight in 1588, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2681255 = idf(docFreq=81, maxDocs=43254)
                0.0625 = fieldNorm(doc=1588)
          0.07791496 = weight(abstract_txt:automatische in 1588) [ClassicSimilarity], result of:
            0.07791496 = score(doc=1588,freq=1.0), product of:
              0.17943822 = queryWeight, product of:
                2.404925 = boost
                6.9474573 = idf(docFreq=112, maxDocs=43254)
                0.010739585 = queryNorm
              0.43421608 = fieldWeight in 1588, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9474573 = idf(docFreq=112, maxDocs=43254)
                0.0625 = fieldNorm(doc=1588)
          0.5262108 = weight(abstract_txt:extraktion in 1588) [ClassicSimilarity], result of:
            0.5262108 = score(doc=1588,freq=3.0), product of:
              0.5600719 = queryWeight, product of:
                6.008706 = boost
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.010739585 = queryNorm
              0.93954146 = fieldWeight in 1588, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.679112 = idf(docFreq=19, maxDocs=43254)
                0.0625 = fieldNorm(doc=1588)
        0.2 = coord(5/25)
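
The content-match scores combine three pieces: a per-term score (queryWeight × fieldWeight), the sum over matching terms, and a coord factor (matched terms / query terms). The sketch below recomputes these for the first hit (doc 2519), again assuming the usual ClassicSimilarity formulas. The helper names are illustrative, all numeric inputs are copied from the listing, and small differences in the last digits are expected because the listing appears to use single-precision arithmetic.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # queryWeight in the listing
    field_weight = math.sqrt(freq) * idf * field_norm  # fieldWeight in the listing
    return query_weight * field_weight

# abstract_txt:lingo in doc 2519
lingo = term_score(freq=4.0, doc_freq=11, max_docs=43254,
                   field_norm=0.0390625, query_norm=0.010739585,
                   boost=1.5905901)

# The eleven per-term scores listed for doc 2519.
term_scores = [
    0.028008265, 0.027421743, 0.02787793, 0.054075375, 0.03928472,
    0.112709634, 0.061814606, 0.027977074, 0.5054434, 0.16906445,
    0.3288817,
]
coord = len(term_scores) / 25   # coord(11/25): 11 of 25 query terms matched
total = sum(term_scores) * coord

print(f"lingo = {lingo:.6f}, sum = {sum(term_scores):.6f}, total = {total:.6f}")
```

The coord factor explains why the first hit ranks far ahead of the others: it matches 11 of the 25 query terms, while the remaining hits match only 4 or 5.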