-
Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I.: Language models are unsupervised multitask learners
0.03
0.031724695 = product of:
0.06344939 = sum of:
0.06344939 = product of:
0.12689878 = sum of:
0.12689878 = weight(_text_:plus in 871) [ClassicSimilarity], result of:
0.12689878 = score(doc=871,freq=2.0), product of:
0.3101809 = queryWeight, product of:
6.1714344 = idf(docFreq=250, maxDocs=44218)
0.05026075 = queryNorm
0.40911216 = fieldWeight in 871, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
6.1714344 = idf(docFreq=250, maxDocs=44218)
0.046875 = fieldNorm(doc=871)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
-
Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009)
0.02
0.02042891 = product of:
0.04085782 = sum of:
0.04085782 = product of:
0.08171564 = sum of:
0.08171564 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
0.08171564 = score(doc=4888,freq=2.0), product of:
0.17600457 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.05026075 = queryNorm
0.46428138 = fieldWeight in 4888, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.09375 = fieldNorm(doc=4888)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Date
- 1. 3.2013 14:56:22
-
Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013)
0.01
0.013619275 = product of:
0.02723855 = sum of:
0.02723855 = product of:
0.0544771 = sum of:
0.0544771 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
0.0544771 = score(doc=1490,freq=2.0), product of:
0.17600457 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.05026075 = queryNorm
0.30952093 = fieldWeight in 1490, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0625 = fieldNorm(doc=1490)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Date
- 22. 3.2015 9:30:24
-
Bager, J.: Die Text-KI ChatGPT schreibt Fachtexte, Prosa, Gedichte und Programmcode (2023)
0.01
0.013619275 = product of:
0.02723855 = sum of:
0.02723855 = product of:
0.0544771 = sum of:
0.0544771 = weight(_text_:22 in 835) [ClassicSimilarity], result of:
0.0544771 = score(doc=835,freq=2.0), product of:
0.17600457 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.05026075 = queryNorm
0.30952093 = fieldWeight in 835, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0625 = fieldNorm(doc=835)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Date
- 29.12.2022 18:22:55
-
Rieger, F.: Lügende Computer (2023)
0.01
0.013619275 = product of:
0.02723855 = sum of:
0.02723855 = product of:
0.0544771 = sum of:
0.0544771 = weight(_text_:22 in 912) [ClassicSimilarity], result of:
0.0544771 = score(doc=912,freq=2.0), product of:
0.17600457 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.05026075 = queryNorm
0.30952093 = fieldWeight in 912, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.0625 = fieldNorm(doc=912)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Date
- 16. 3.2023 19:22:55
-
Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018)
0.01
0.0068096374 = product of:
0.013619275 = sum of:
0.013619275 = product of:
0.02723855 = sum of:
0.02723855 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
0.02723855 = score(doc=4217,freq=2.0), product of:
0.17600457 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.05026075 = queryNorm
0.15476047 = fieldWeight in 4217, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.03125 = fieldNorm(doc=4217)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Date
- 22. 1.2018 11:32:44
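
The score breakdowns listed with each record appear to follow Lucene's ClassicSimilarity (TF-IDF) formula: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and the result is scaled by each coord factor. The following is a minimal sketch that recomputes the first record's score from the values printed in its breakdown. The function names `classic_idf` and `classic_score` are hypothetical helpers chosen for illustration and are not part of any library.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute a single-term score from the values shown in a breakdown.

    All arguments are read directly off the listing; `coords` holds the
    coord(...) factors applied on the way up the tree.
    """
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                    # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm         # queryWeight
    field_weight = tf * idf * field_norm    # fieldWeight
    score = query_weight * field_weight
    for c in coords:                        # each coord(1/2) halves the score
        score *= c
    return score


# Values taken from the first record (term "plus" in doc 871).
score = classic_score(
    freq=2.0,
    doc_freq=250,
    max_docs=44218,
    query_norm=0.05026075,
    field_norm=0.046875,
    coords=(0.5, 0.5),
)
print(round(score, 4))  # about 0.0317, in line with the listed total
```

The same function can be applied to the other records by substituting their docFreq and fieldNorm values. Because idf and queryNorm are identical for every hit on the term "22", the differences between those scores come from fieldNorm alone.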