SST2, IMDB, Rotten Tomatoes

The SST2 dataset contains 6920/872/1821 sentences in its train/dev/test sets, and the task is binary classification of each sentence as positive or negative sentiment. The IMDB dataset contains 25000/25000 reviews in its train/test sets, with the same binary positive/negative labels. The Rotten Tomatoes dataset contains 5331 positive and 5331 negative review sentences.
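As a quick sanity check on the sizes quoted above, the counts can be recorded in a small table and summed. This is only an illustrative sketch: the names `SPLIT_SIZES` and `total_examples` are hypothetical and not part of any dataset loader, and the Rotten Tomatoes entry lists per-class counts because the description does not give a train/test split.

```python
# Sizes copied from the description above; nothing here is downloaded.
SPLIT_SIZES = {
    "SST2": {"train": 6920, "dev": 872, "test": 1821},
    "IMDB": {"train": 25000, "test": 25000},
    # Only per-class counts are given for Rotten Tomatoes (no split stated).
    "Rotten Tomatoes": {"positive": 5331, "negative": 5331},
}


def total_examples(name: str) -> int:
    """Return the total number of examples listed for one dataset."""
    return sum(SPLIT_SIZES[name].values())


for name in SPLIT_SIZES:
    print(f"{name}: {total_examples(name)} examples")
```

Under these figures the totals come to 9613 for SST2, 50000 for IMDB, and 10662 for Rotten Tomatoes.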

BibTeX: