-
Connected Behavior
The dataset used for user-level stance detection, comprising over 100 million tweets. -
Online Media Monitor (OMM) dataset
The Online Media Monitor (OMM) from the University of Hamburg contributed a dataset of 5,236,660 unlabeled tweets gathered from June 21, 2022, to December 8, 2022. -
Twitter Climate Change Sentiment Dataset
The Twitter Climate Change Sentiment Dataset contains labelled tweets pertaining to climate change, covering the time period from April 27, 2015, to February 21, 2018. -
Waseem and Hovy (2016) dataset
A dataset of 16,000 tweets, of which 3,383 were labelled as sexist. -
GeoUK 2022 Tweets Dataset
A dataset of geolocated tweets from 2022, filtered to keep only tweets posted in the UK. -
Weather Tweets Dataset
A manually labelled dataset of 124,360 weather tweets collected by Asiaee T. et al. (2012) as part of the "Dialogue Earth" project. -
SMM4H18_Test
The dataset consists of tweets posted by 212 Twitter users during and after their pregnancy. -
SMM4H18_Val
The dataset consists of tweets posted by 212 Twitter users during and after their pregnancy. -
SMM4H18_Train
The dataset consists of tweets posted by 212 Twitter users during and after their pregnancy. -
BioCreative_TrainTask3.1
The dataset consists of tweets posted by 212 Twitter users during and after their pregnancy. -
BioCreative_ValTask3
The dataset consists of tweets posted by 212 Twitter users during and after their pregnancy. -
BioCreative_TrainTask3.0
The dataset consists of all tweets posted by 212 Twitter users during and after their pregnancy. -
Dynamic Dataset
The Dynamic dataset is a collection of tweets generated using a human-and-model-in-the-loop process and containing challenging perturbations. -
Waseem Dataset
The Waseem dataset is a collection of tweets containing hate speech, with a focus on sexist and racist language. -
BioCreative VII Track 3 - Medication Detection in Tweets (2018)
The 2018 dataset used in the BioCreative VII Track 3 - automatic extraction of medication names in tweets. -
BioCreative VII Track 3 - Medication Detection in Tweets
The dataset used in the BioCreative VII Track 3 - automatic extraction of medication names in tweets. -
Stock Movement and Volatility Prediction from Tweets, Macroeconomic Factors a...
The dataset used in the paper for stock movement and volatility prediction from tweets, macroeconomic factors and historical prices. -
Twitter Data Dataset
The dataset used in this paper is a collection of Twitter data, including tweets, retweets, and replies. -
Twitter Social Media Dataset
The dataset used in this paper is a collection of social media data from Twitter, including user profiles, follow links, and tweets. -
Stream TwitterSentiment
Stream TwitterSentiment is a dataset of tweets, focusing on sentiment analysis, and is used to test the performance of active stream learning algorithms for polarity learning.