- NeurIPS dataset
The NeurIPS dataset is a collection of 7,241 papers published at NeurIPS from 1987 to 2016.
- 20Newsgroups dataset
The 20Newsgroups dataset contains 18,846 newsgroup documents.
- Yelp Dataset
The Yelp Dataset contains 1.6M reviews and 500K tips by 366K users for 61K businesses; 481K business attributes, such as hours, parking availability, and ambience; and check-ins for...
- Yelp Dataset Challenge
The Yelp Dataset Challenge contains reviews and images of restaurants, with the goal of recommending images for each review.
- Penn Treebank
The Penn Treebank dataset contains one million words of 1989 Wall Street Journal material annotated in Treebank II style, comprising 42K sentences of varying lengths.
- BookCorpus
BookCorpus is the dataset used in this paper for unsupervised sentence representation learning; it consists of paragraphs of unlabeled text.