Wikitext-103 and MusDB datasets
The paper does not explicitly name a single dataset. It does state that the authors trained a 16-layer Transformer (Vaswani et al., 2017) language model on the Wikitext-103 text corpus (Merity et al., 2016) and a Demucs source separation model (Défossez et al., 2019) on the MusDB dataset (Rafii et al., 2017).
BibTeX: