-
P-Stance, SemEval-2016, and MTSD datasets
A collection of datasets for political stance detection, comprising P-Stance, SemEval-2016, and MTSD. -
Movie Reviews
The Movie Reviews dataset contains positive and negative movie reviews with rationales annotated by humans to support classification. -
DBpedia Animals
The DBpedia Animals dataset comprises 10,000 English Wikipedia article abstracts for animal species. -
Rcv1: A new benchmark collection for text categorization research
Rcv1: A new benchmark collection for text categorization research. -
AGNews, 20News, NYT, IMDB
AGNews, 20News, NYT, and IMDB are datasets used for weakly supervised text classification. -
HateXplain
The HateXplain dataset contains 20,000 posts from Gab and Twitter, each annotated with a hate, offensive, or normal label. -
CLIMABENCH
CLIMABENCH is a benchmark of climate-related text classification tasks. It collates five existing climate change-related text datasets, including CLIMATEXT, CLIMATESTANCE,... -
NeurIPS dataset
The NeurIPS dataset is a collection of 7,241 papers published in NeurIPS from 1987 to 2016. -
Wikipedia dataset
The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries... -
Reuters Dataset
The Reuters dataset is a text classification dataset containing 21,578 samples. -
Text Classification Datasets
The dataset used in the paper is a collection of adversarial examples and natural examples for text classification tasks. -
Shakespeare dataset
Mobile crowdsensing has gained significant attention in recent years and has become a critical paradigm for emerging Internet of Things applications. The sensing devices... -
Reuters21578
The problem of similarity search is to find the most similar items in a large collection to a query item of interest. Fast similarity search is at the core of many information...