-
Movie reviews and Aggressive messages corpus
The dataset is a corpus of movie reviews and anonymous imageboard messages annotated for whether or not they express aggression. -
Yelp Reviews Polarity
The Yelp Reviews Polarity dataset contains English-language customer reviews from Yelp: 560k in the training portion and 38k in the dev portion. -
SemEval-2016 Task 6: Detecting stance in tweets
SemEval-2016 Task 6: Detecting stance in tweets. -
ClimateNLP: Analyzing Public Sentiment Towards Climate Change using NLP
A dataset of climate change-related tweets used for sentiment analysis. -
Rotten Tomatoes
The Rotten Tomatoes dataset has 5331 positive and 5331 negative review sentences. -
SST2, IMDB, Rotten Tomatoes
The SST2 dataset has 6920/872/1821 example sentences in the train/dev/test sets. The task is binary classification into positive/negative sentiment. The IMDB dataset has... -
Booking Labeled Dataset
A dataset used to train a text classifier that learns the polarity of hotel reviews. -
Memotion Dataset 7K
The Memotion Dataset 7K is a collection of 7000 memes with associated metadata. -
Sentiment Analysis Dataset
The dataset used in the paper is a collection of unstructured text data from social networks, news sites, and forums. -
Polarity dataset
The Polarity dataset contains text documents with sentiment labels. -
IMDb Review Dataset
The IMDb review dataset is used for a positive-sentiment generation task. -
Annotating expressions of opinions and emotions in language
Annotating expressions of opinions and emotions in language -
Jester Dataset
The Jester dataset contains ratings of 100 jokes on a continuous scale from 1 to 10, collected from a total of 73421 users. -
Sent140 dataset
The dataset used in the paper is a real-world dataset for sentiment analysis. -
Twitter Dataset
The Twitter Dataset is a collection of tweets in three languages (English, Dutch, and German) annotated with Plutchik's emotions. -
Classification of Research Citations (CRC)
A dataset of 150 research papers from the domain of computer science, manually annotated and class labelled for sentiment analysis. -
Stream TwitterSentiment
Stream TwitterSentiment is a tweet dataset for sentiment analysis, used to test the performance of active stream learning algorithms for polarity learning. -
Hatespeech
The Hatespeech dataset is a collection of tweets containing terms from hate speech lexicons.