Shree, P.: The journey of Open AI GPT models (2020)
    
    Abstract
Generative Pre-trained Transformer (GPT) models by OpenAI have taken the natural language processing (NLP) community by storm by introducing very powerful language models. These models can perform various NLP tasks like question answering, textual entailment, and text summarisation without any supervised training. They need very few to no examples to understand a task and perform on par with, or even better than, state-of-the-art models trained in a supervised fashion. In this article we will cover the journey of these models and understand how they have evolved over a period of 2 years:
1. Discussion of the GPT-1 paper (Improving Language Understanding by Generative Pre-Training).
2. Discussion of the GPT-2 paper (Language Models are Unsupervised Multitask Learners) and its improvements over GPT-1.
3. Discussion of the GPT-3 paper (Language Models are Few-Shot Learners) and the improvements which have made it one of the most powerful models NLP has seen to date.
This article assumes familiarity with the basics of NLP terminology and the transformer architecture.
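
To make the few-shot idea concrete, here is a minimal sketch using the publicly available GPT-2 model through the Hugging Face transformers pipeline. This is an illustration, not code from the article: the model choice and translation prompt are assumptions (the prompt follows the style of the examples in the GPT-3 paper), and the small base GPT-2 will be far less reliable at this than GPT-3.

    from transformers import pipeline

    # Load a small, public GPT model; no fine-tuning or gradient updates happen here.
    generator = pipeline("text-generation", model="gpt2")

    # Few-shot prompting: a handful of in-context examples, then a new input.
    # The model is expected to continue the pattern purely from the prompt.
    prompt = (
        "Translate English to French.\n"
        "sea otter => loutre de mer\n"
        "cheese => fromage\n"
        "plush giraffe => girafe en peluche\n"
        "book =>"
    )

    result = generator(prompt, max_new_tokens=5, do_sample=False)
    print(result[0]["generated_text"])

The point of the sketch is that the task is specified entirely in the prompt: the same frozen model can be pointed at a different task by swapping the in-context examples, which is the "few to no examples, no supervised training" behaviour the abstract describes.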