-
text models library dataset
A dataset of multi-year and multi-language preprocessed and aggregated data obtained from people using social networks. -
text models library
A library and an extensive dataset of multi-year and multi-language preprocessed and aggregated data obtained from people using social networks. -
coinform250
The dataset used for the coinform250 task, containing tweets with annotated labels. -
COVID-19 Twitter Data
The COVID-19 Twitter Data dataset contains tweets about the COVID-19 pandemic. -
Hashtags are (not) judgemental: The untold story of Lok Sabha elections 2019
The dataset contains over 24 million hashtags collected from Twitter during the 2019 Lok Sabha elections in India. -
Twitter User Demographics Dataset
This dataset includes the identifiers of sampled Twitter users, labeled through crowdsourcing with respect to the personal attributes of age, gender, ethnicity, family... -
TQ+ datasets for whataboutism detection
Two new datasets for whataboutism detection. -
Reddit comments with Big 5 personality facet scores
The dataset used in the paper consists of Reddit comments paired with self-reported Big 5 personality facet scores. -
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Pers...
Human annotation plays a core role in machine learning — annotations for supervised models, safety guardrails for generative models, and human feedback for reinforcement... -
News and Social Media Articles Dataset
A dataset of annotated news and social media articles, spanning various aspects and media. -
Kaggle M5 Competition virtual community dataset
The Kaggle M5 Competition virtual community dataset contains discussion posts, comments, and user-generated content from the M5 competition.