- Toxic Comment Classification Challenge dataset
  The Toxic Comment Classification Challenge dataset contains Wikipedia comments labeled with six classes: toxic, severe toxic, obscene, threat, insult, and identity hate.
- Hate Speech Tweets dataset
  The Hate Speech Tweets dataset contains over 24,000 English tweets labeled as non-offensive, hate speech, or profanity.
- Classification of Research Citations (CRC)
  A dataset of 150 computer science research papers, manually annotated with class labels for sentiment analysis.
- Stream TwitterSentiment
  Stream TwitterSentiment is a dataset of tweets for sentiment analysis, used to test the performance of active stream learning algorithms for polarity learning.
- Hatespeech
  The Hatespeech dataset is a collection of tweets containing lexicon terms used in hate speech.
- IMDB and Yelp datasets
  IMDB and Yelp are datasets used for sentiment analysis and author identification.
- Entity-Specific Sentiment Classification of Yahoo News Comments
  A dataset for entity-specific sentiment classification of Yahoo News comments.
- Sentiment Training Dataset
  Sentiment training dataset for LABDet, a robust and language-agnostic bias probing method that quantifies intrinsic bias in monolingual pretrained language models (PLMs).
- Tweet Sentiment Extraction
  The Tweet Sentiment Extraction dataset contains positive, negative, and neutral tweets with human-annotated rationales.
- Movie Reviews
  The Movie Reviews dataset contains positive and negative movie reviews with rationales annotated by humans to support classification.
- MoodyLyricsPN
  MoodyLyricsPN is a larger collection of 5,000 songs labeled only as positive or negative.
- MoodyLyrics4Q
  MoodyLyrics4Q is a dataset of 2,000 songs, fully compliant with the four requisites listed in the previous section.
- Sentiment-oriented Transformer-based Variational Autoencoder Network for Live...
  Sentiment-oriented Transformer-based Variational Autoencoder (So-TVAE) for Live Video Commenting
- New York Times and 20Newsgroups datasets
  The datasets used in the paper are the New York Times dataset and the 20Newsgroups dataset.