- Wikitext-103 and LAMBADA datasets: The paper does not explicitly name a single dataset, but it states that the authors trained a GPT-2 transformer language model on the Wikitext-103 and LAMBADA datasets.
- WikiText-103 dataset: The dataset used in this paper is WikiText-103, a large corpus of text.