2 datasets found

Tags: Text Corpus

  • Wall Street Journal corpus

    The Wall Street Journal corpus (wsj), WikiText-103 (wiki), and the dev split of LibriSpeech (lib-dev) are used.
  • BookCorpus

    The dataset used in this paper for unsupervised sentence representation learning; it consists of paragraphs of unlabeled text.
You can also access this registry using the API (see API Docs).