2 datasets found

Groups: Text Analysis
Organizations: No Organization
Formats: JSON

  • Wikipedia Corpus

    The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles belonging to one of the following categories: People, Cities,...
  • BookCorpus

    The dataset used in this paper for unsupervised sentence representation learning consists of paragraphs from unlabeled text.