-
AG's News Corpus -
RCV1: A new benchmark collection for text categorization research -
AGNews, 20News, NYT, IMDB
AGNews, 20News, NYT, and IMDB are datasets used for weakly supervised text classification. -
HateXplain
The HateXplain dataset contains 20,000 posts from Gab and Twitter, each annotated with a hate, offensive, or normal label. -
CLIMABENCH
CLIMABENCH is a benchmark of climate-related text classification tasks. It collates five existing climate change-related text datasets, including CLIMATEXT, CLIMATESTANCE,... -
NeurIPS dataset
The NeurIPS dataset is a collection of 7,241 papers published in NeurIPS from 1987 to 2016. -
Wikipedia dataset
The Wikipedia dataset contains over six million English Wikipedia articles with a full-text field associated with 50 training queries... -
IMDB Document
The dataset used in the paper is a collection of text sequences for text classification tasks. -
Yelp 2014 Document
The dataset used in the paper is a collection of text sequences for text classification tasks. -
Yelp 2013 Document
The dataset used in the paper is a collection of text sequences for text classification tasks. -
Yelp Review Dataset
The Yelp review dataset contains hotel and restaurant reviews that Yelp either filtered (spam) or recommended (legitimate). -
20NG Dataset
The 20NG dataset is a text classification dataset containing 20 categories. -
Ohsumed Dataset
The Ohsumed dataset is a text classification dataset containing 3,357 documents. -
Reuters Dataset
The Reuters dataset is a text classification dataset containing 21,578 samples. -
Text Classification Dataset
A dataset used for text classification with a variant of the typical text classification model based on convolutional operations and a max-pooling layer.