- Wikitext-103
  The dataset used in this paper is Wikitext-103, a general English language corpus containing good and featured Wikipedia articles.
- SeqDiffuSeq
  The dataset used in the SeqDiffuSeq paper for sequence-to-sequence text generation.
- BookCorpus
  The dataset used in this paper for unsupervised sentence representation learning, consisting of paragraphs from unlabeled text.
- PatentEval Dataset
  The PatentEval dataset is a comprehensive dataset for evaluating patent text generation.
- WikiText-103 dataset
  The dataset used in this paper is the WikiText-103 dataset, which contains a large corpus of text.