Search (6 results, page 1 of 1)

Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I.: Language models are unsupervised multitask learners 0.03
```
0.031724695 = product of:
  0.06344939 = sum of:
    0.06344939 = product of:
      0.12689878 = sum of:
        0.12689878 = weight(_text_:plus in 871) [ClassicSimilarity], result of:
          0.12689878 = score(doc=871,freq=2.0), product of:
            0.3101809 = queryWeight, product of:
              6.1714344 = idf(docFreq=250, maxDocs=44218)
              0.05026075 = queryNorm
            0.40911216 = fieldWeight in 871, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.1714344 = idf(docFreq=250, maxDocs=44218)
              0.046875 = fieldNorm(doc=871)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
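The numbers in the explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function name and its parameters are illustrative and not part of the search system itself.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute a single-term score from the values shown in an explain tree.

    Assumes the ClassicSimilarity formulas; all names here are hypothetical.
    """
    tf = math.sqrt(freq)                              # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                   # queryWeight
    field_weight = tf * idf * field_norm              # fieldWeight
    score = query_weight * field_weight               # weight(_text_:term in doc)
    for coord in coords:                              # each coord(1/2) factor
        score *= coord
    return score


# Values copied from the first hit (doc=871, term "plus").
first_hit = classic_similarity_score(
    freq=2.0, doc_freq=250, max_docs=44218,
    query_norm=0.05026075, field_norm=0.046875,
    coords=(0.5, 0.5),
)
print(round(first_hit, 6))
```

With the values of the first hit, the result should land close to the 0.0317 shown in the listing; small differences in the last digits are expected because Lucene computes in single precision. The other hits share the same term statistics (docFreq=3622) and differ only in fieldNorm, which explains why their scores are simple multiples of one another.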
Abstract

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.

Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02

0.02042891 = product of:
  0.04085782 = sum of:
    0.04085782 = product of:
      0.08171564 = sum of:
        0.08171564 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
          0.08171564 = score(doc=4888,freq=2.0), product of:
            0.17600457 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05026075 = queryNorm
            0.46428138 = fieldWeight in 4888, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4888)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 1. 3.2013 14:56:22

Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.01

0.013619275 = product of:
  0.02723855 = sum of:
    0.02723855 = product of:
      0.0544771 = sum of:
        0.0544771 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
          0.0544771 = score(doc=1490,freq=2.0), product of:
            0.17600457 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05026075 = queryNorm
            0.30952093 = fieldWeight in 1490, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1490)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 3.2015 9:30:24

Bager, J.: Die Text-KI ChatGPT schreibt Fachtexte, Prosa, Gedichte und Programmcode (2023) 0.01

0.013619275 = product of:
  0.02723855 = sum of:
    0.02723855 = product of:
      0.0544771 = sum of:
        0.0544771 = weight(_text_:22 in 835) [ClassicSimilarity], result of:
          0.0544771 = score(doc=835,freq=2.0), product of:
            0.17600457 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05026075 = queryNorm
            0.30952093 = fieldWeight in 835, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=835)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 29.12.2022 18:22:55

Rieger, F.: Lügende Computer (2023) 0.01

0.013619275 = product of:
  0.02723855 = sum of:
    0.02723855 = product of:
      0.0544771 = sum of:
        0.0544771 = weight(_text_:22 in 912) [ClassicSimilarity], result of:
          0.0544771 = score(doc=912,freq=2.0), product of:
            0.17600457 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05026075 = queryNorm
            0.30952093 = fieldWeight in 912, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=912)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 16. 3.2023 19:22:55

Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018) 0.01

0.0068096374 = product of:
  0.013619275 = sum of:
    0.013619275 = product of:
      0.02723855 = sum of:
        0.02723855 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
          0.02723855 = score(doc=4217,freq=2.0), product of:
            0.17600457 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05026075 = queryNorm
            0.15476047 = fieldWeight in 4217, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=4217)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 1.2018 11:32:44
