Document (#43801)

Author
Schaer, P.
Title
Sprachmodelle und neuronale Netze im Information Retrieval
Source
Grundlagen der Informationswissenschaft. Hrsg.: Rainer Kuhlen, Dirk Lewandowski, Wolfgang Semar und Christa Womser-Hacker. 7., völlig neu gefasste Ausg
Imprint
Berlin : De Gruyter
Year
2023
Pages
S.455-466
Abstract
In recent years, language model technologies of many kinds have found their way into information science. These language models, known under names such as GPT, ELMo, or BERT, have in common that, thanks to very large web corpora, they draw on a data basis that was unthinkable for earlier language modeling approaches. At the same time, these models build on recent developments in machine learning, in particular artificial neural networks (NNs). These technologies have also gained a foothold in information retrieval (IR) and achieved abrupt, substantial performance gains shortly after their introduction. Neural networks, in combination with large pretrained language models and contextualized word embeddings, have led to [...]. Whereas past years saw repeated complaints about stagnating retrieval performance, with improvements shown only over "weak baselines", these technical and methodological innovations have produced impressive performance gains in tasks such as classic ad hoc retrieval, machine translation, and question answering. This chapter gives a brief overview of the fundamentals of language models and NNs in order to explain the basic building blocks behind current technologies such as ELMo or BERT, which presently dominate the world of NLP and IR.
Footnote
Cf.: https://doi.org/10.1515/9783110769043.
Theme
Computerlinguistik
Object
GPT-3

Similar documents (author)

  1. Schaer, P.: Integration von Open-Access-Repositorien in Fachportale (2010) 5.58
    5.5776863 = sum of:
      5.5776863 = weight(author_txt:schaer in 2320) [ClassicSimilarity], result of:
        5.5776863 = fieldWeight in 2320, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.924298 = idf(docFreq=15, maxDocs=44218)
          0.625 = fieldNorm(doc=2320)
    
  2. Munkelt, J.; Schaer, P.: Towards an IR test collection for the German National Library (2018) 4.46
    4.462149 = sum of:
      4.462149 = weight(author_txt:schaer in 5780) [ClassicSimilarity], result of:
        4.462149 = fieldWeight in 5780, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.924298 = idf(docFreq=15, maxDocs=44218)
          0.5 = fieldNorm(doc=5780)
    
  3. Mayr, P.; Schaer, P.; Mutschke, P.: A science model driven retrieval prototype (2011) 3.35
    3.346612 = sum of:
      3.346612 = weight(author_txt:schaer in 649) [ClassicSimilarity], result of:
        3.346612 = fieldWeight in 649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.924298 = idf(docFreq=15, maxDocs=44218)
          0.375 = fieldNorm(doc=649)
    
  4. Neumann, M.; Steinberg, J.; Schaer, P.: Web-scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 3.35
    3.346612 = sum of:
      3.346612 = weight(author_txt:schaer in 3895) [ClassicSimilarity], result of:
        3.346612 = fieldWeight in 3895, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.924298 = idf(docFreq=15, maxDocs=44218)
          0.375 = fieldNorm(doc=3895)
    
  5. Munkelt, J.; Schaer, P.; Lepsky, K.: Towards an IR test collection for the German National Library (2018) 3.35
    3.346612 = sum of:
      3.346612 = weight(author_txt:schaer in 4311) [ClassicSimilarity], result of:
        3.346612 = fieldWeight in 4311, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.924298 = idf(docFreq=15, maxDocs=44218)
          0.375 = fieldNorm(doc=4311)
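
The author-match explanations above all reduce to the same product: fieldWeight = tf × idf × fieldNorm. The following is a minimal sketch of that arithmetic, assuming the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and are not part of Lucene's API; the fieldNorm values are copied from the listing rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # (doc id, fieldNorm) pairs taken from the listing; freq=1, docFreq=15, maxDocs=44218.
    for doc_id, norm in [(2320, 0.625), (5780, 0.5), (649, 0.375)]:
        print(doc_id, field_weight(1.0, 15, 44218, norm))
```

Because the query consists of a single author term, the only factor that differs between the five hits is fieldNorm, which is smaller when the author field contains more names. The computed values should agree with the listed scores up to single-precision rounding.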
    

Similar documents (content)

  1. Matt, A.; Schaber, E.; Violet, B.: Vielfältige Formate und dynamische Umsetzung : Mathematik-Kommunikation zu Künstlicher Intelligenz bei IMAGINARY (2023) 0.15
    0.14551711 = sum of:
      0.14551711 = product of:
        0.72758555 = sum of:
          0.029089635 = weight(abstract_txt:diese in 891) [ClassicSimilarity], result of:
            0.029089635 = score(doc=891,freq=1.0), product of:
              0.07156928 = queryWeight, product of:
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.016507689 = queryNorm
              0.4064542 = fieldWeight in 891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.09375 = fieldNorm(doc=891)
          0.040876325 = weight(abstract_txt:oder in 891) [ClassicSimilarity], result of:
            0.040876325 = score(doc=891,freq=1.0), product of:
              0.10278098 = queryWeight, product of:
                1.4677048 = boost
                4.2421675 = idf(docFreq=1727, maxDocs=44218)
                0.016507689 = queryNorm
              0.3977032 = fieldWeight in 891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2421675 = idf(docFreq=1727, maxDocs=44218)
                0.09375 = fieldNorm(doc=891)
          0.1538528 = weight(abstract_txt:maschinellen in 891) [ClassicSimilarity], result of:
            0.1538528 = score(doc=891,freq=1.0), product of:
              0.21725582 = queryWeight, product of:
                1.7422978 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.016507689 = queryNorm
              0.7081643 = fieldWeight in 891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.09375 = fieldNorm(doc=891)
          0.20850289 = weight(abstract_txt:netze in 891) [ClassicSimilarity], result of:
            0.20850289 = score(doc=891,freq=1.0), product of:
              0.30456004 = queryWeight, product of:
                2.526497 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.016507689 = queryNorm
              0.6846036 = fieldWeight in 891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.09375 = fieldNorm(doc=891)
          0.2952639 = weight(abstract_txt:neuronale in 891) [ClassicSimilarity], result of:
            0.2952639 = score(doc=891,freq=1.0), product of:
              0.38406533 = queryWeight, product of:
                2.8371668 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.016507689 = queryNorm
              0.7687856 = fieldWeight in 891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.09375 = fieldNorm(doc=891)
        0.2 = coord(5/25)
    
  2. Bischoff, M.: KI lernt die Sprache der Mathematik (2020) 0.11
    0.11160747 = sum of:
      0.11160747 = product of:
        0.9300623 = sum of:
          0.09045103 = weight(abstract_txt:haben in 5904) [ClassicSimilarity], result of:
            0.09045103 = score(doc=5904,freq=1.0), product of:
              0.12415801 = queryWeight, product of:
                1.6131312 = boost
                4.6624994 = idf(docFreq=1134, maxDocs=44218)
                0.016507689 = queryNorm
              0.7285155 = fieldWeight in 5904, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6624994 = idf(docFreq=1134, maxDocs=44218)
                0.15625 = fieldNorm(doc=5904)
          0.34750482 = weight(abstract_txt:netze in 5904) [ClassicSimilarity], result of:
            0.34750482 = score(doc=5904,freq=1.0), product of:
              0.30456004 = queryWeight, product of:
                2.526497 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.016507689 = queryNorm
              1.141006 = fieldWeight in 5904, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.15625 = fieldNorm(doc=5904)
          0.49210647 = weight(abstract_txt:neuronale in 5904) [ClassicSimilarity], result of:
            0.49210647 = score(doc=5904,freq=1.0), product of:
              0.38406533 = queryWeight, product of:
                2.8371668 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.016507689 = queryNorm
              1.2813092 = fieldWeight in 5904, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.15625 = fieldNorm(doc=5904)
        0.12 = coord(3/25)
    
  3. Angerer, C.: Neuronale Netze : Revolution für die Wissenschaft? (2018) 0.11
    0.11040078 = sum of:
      0.11040078 = product of:
        0.9200065 = sum of:
          0.08039518 = weight(abstract_txt:jahren in 4023) [ClassicSimilarity], result of:
            0.08039518 = score(doc=4023,freq=1.0), product of:
              0.10026637 = queryWeight, product of:
                1.1836256 = boost
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.016507689 = queryNorm
              0.801816 = fieldWeight in 4023, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.15625 = fieldNorm(doc=4023)
          0.34750482 = weight(abstract_txt:netze in 4023) [ClassicSimilarity], result of:
            0.34750482 = score(doc=4023,freq=1.0), product of:
              0.30456004 = queryWeight, product of:
                2.526497 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.016507689 = queryNorm
              1.141006 = fieldWeight in 4023, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.15625 = fieldNorm(doc=4023)
          0.49210647 = weight(abstract_txt:neuronale in 4023) [ClassicSimilarity], result of:
            0.49210647 = score(doc=4023,freq=1.0), product of:
              0.38406533 = queryWeight, product of:
                2.8371668 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.016507689 = queryNorm
              1.2813092 = fieldWeight in 4023, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.15625 = fieldNorm(doc=4023)
        0.12 = coord(3/25)
    
  4. Lämmel, U.; Cleve, J.: Künstliche Intelligenz : mit 50 Tabellen, 43 Beispielen, 208 Aufgaben, 89 Kontrollfragen und Referatsthemen (2008) 0.11
    0.105277896 = sum of:
      0.105277896 = product of:
        0.6579869 = sum of:
          0.016968954 = weight(abstract_txt:diese in 642) [ClassicSimilarity], result of:
            0.016968954 = score(doc=642,freq=1.0), product of:
              0.07156928 = queryWeight, product of:
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.016507689 = queryNorm
              0.23709829 = fieldWeight in 642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.0546875 = fieldNorm(doc=642)
          0.044770982 = weight(abstract_txt:haben in 642) [ClassicSimilarity], result of:
            0.044770982 = score(doc=642,freq=2.0), product of:
              0.12415801 = queryWeight, product of:
                1.6131312 = boost
                4.6624994 = idf(docFreq=1134, maxDocs=44218)
                0.016507689 = queryNorm
              0.3605968 = fieldWeight in 642, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6624994 = idf(docFreq=1134, maxDocs=44218)
                0.0546875 = fieldNorm(doc=642)
          0.29792333 = weight(abstract_txt:netze in 642) [ClassicSimilarity], result of:
            0.29792333 = score(doc=642,freq=6.0), product of:
              0.30456004 = queryWeight, product of:
                2.526497 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.016507689 = queryNorm
              0.9782089 = fieldWeight in 642, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.0546875 = fieldNorm(doc=642)
          0.29832366 = weight(abstract_txt:neuronale in 642) [ClassicSimilarity], result of:
            0.29832366 = score(doc=642,freq=3.0), product of:
              0.38406533 = queryWeight, product of:
                2.8371668 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.016507689 = queryNorm
              0.7767524 = fieldWeight in 642, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0546875 = fieldNorm(doc=642)
        0.16 = coord(4/25)
    
  5. Assfalg, R.: Metadaten (2023) 0.09
    0.09285868 = sum of:
      0.09285868 = product of:
        0.4642934 = sum of:
          0.05878218 = weight(abstract_txt:diese in 787) [ClassicSimilarity], result of:
            0.05878218 = score(doc=787,freq=3.0), product of:
              0.07156928 = queryWeight, product of:
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.016507689 = queryNorm
              0.8213326 = fieldWeight in 787, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3355117 = idf(docFreq=1573, maxDocs=44218)
                0.109375 = fieldNorm(doc=787)
          0.13717175 = weight(abstract_txt:ausprägungen in 787) [ClassicSimilarity], result of:
            0.13717175 = score(doc=787,freq=1.0), product of:
              0.14413509 = queryWeight, product of:
                1.003475 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.016507689 = queryNorm
              0.95168877 = fieldWeight in 787, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.109375 = fieldNorm(doc=787)
          0.14799814 = weight(abstract_txt:gefasst in 787) [ClassicSimilarity], result of:
            0.14799814 = score(doc=787,freq=1.0), product of:
              0.15162265 = queryWeight, product of:
                1.0292094 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.016507689 = queryNorm
              0.97609514 = fieldWeight in 787, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.109375 = fieldNorm(doc=787)
          0.07265227 = weight(abstract_txt:diesen in 787) [ClassicSimilarity], result of:
            0.07265227 = score(doc=787,freq=1.0), product of:
              0.11887834 = queryWeight, product of:
                1.2888075 = boost
                5.58764 = idf(docFreq=449, maxDocs=44218)
                0.016507689 = queryNorm
              0.6111481 = fieldWeight in 787, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.58764 = idf(docFreq=449, maxDocs=44218)
                0.109375 = fieldNorm(doc=787)
          0.047689047 = weight(abstract_txt:oder in 787) [ClassicSimilarity], result of:
            0.047689047 = score(doc=787,freq=1.0), product of:
              0.10278098 = queryWeight, product of:
                1.4677048 = boost
                4.2421675 = idf(docFreq=1727, maxDocs=44218)
                0.016507689 = queryNorm
              0.46398705 = fieldWeight in 787, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2421675 = idf(docFreq=1727, maxDocs=44218)
                0.109375 = fieldNorm(doc=787)
        0.2 = coord(5/25)
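
The content-match explanations above combine several per-term scores: each term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched clauses / total clauses). The sketch below recomputes the first hit (doc=891) under these assumptions: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and a boost of 1.0 where the listing shows none (as for "diese"). Boosts, queryNorm, and fieldNorm are copied from the listing, and all function and variable names are illustrative, not Lucene API.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.016507689  # copied from the listing


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float, boost: float = 1.0) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight


# (term, boost, docFreq) for doc=891; every term occurs once, fieldNorm=0.09375.
DOC_891_TERMS = [
    ("diese", 1.0, 1573),  # no boost line in the listing, so 1.0 is assumed
    ("oder", 1.4677048, 1727),
    ("maschinellen", 1.7422978, 62),
    ("netze", 2.526497, 80),
    ("neuronale", 2.8371668, 32),
]

term_sum = sum(term_score(1.0, df, 0.09375, boost) for _, boost, df in DOC_891_TERMS)
coord = len(DOC_891_TERMS) / 25  # 5 of 25 query clauses matched
score_891 = coord * term_sum

if __name__ == "__main__":
    print("sum of term scores:", term_sum)
    print("final score:", score_891)
```

The same functions cover hits with repeated terms: for doc=642 the listing shows tf(freq=6.0) = 2.4494898, which is sqrt(6). Results should match the listed values up to single-precision rounding.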