- SST-1, SST-2, Subj, TREC, CR, MPQA
The experiments use a set of common benchmark datasets for natural language processing.
- Yahoo and Yelp corpora
The Yahoo and Yelp corpora contain 100k sentences with a greater average length.
- Penn Treebank
The Penn Treebank dataset contains one million words of 1989 Wall Street Journal material annotated in Treebank II style, with 42k sentences of varying lengths.