Social Media Analysis - Groups

text models library dataset

A dataset of multi-year and multi-language preprocessed and aggregated data obtained from people using social networks.

Dataset
JSON

text models library

A library and an extensive dataset of multi-year and multi-language preprocessed and aggregated data obtained from people using social networks.

Dataset
JSON

coinform250

The dataset used for the coinform250 task, containing tweets with annotated labels.

Dataset
JSON

COVID-19 Twitter Data

This dataset contains tweets about the COVID-19 pandemic.

Dataset
JSON

Hashtags are (not) judgemental: The untold story of Lok Sabha elections 2019

The dataset contains over 24 million hashtags collected from Twitter during the 2019 Lok Sabha elections in India.

Dataset
JSON

Twitter User Demographics Dataset

This dataset includes the identifiers of sampled Twitter users, labeled through crowdsourcing with respect to the personal attributes of age, gender, ethnicity, family...

Dataset
JSON

TQ+ datasets for whataboutism detection

Two new datasets for whataboutism detection.

Dataset
JSON

Reddit comments with Big 5 personality facet scores

The dataset used in the paper consists of Reddit comments with self-reported Big 5 personality facet scores.

Dataset
JSON

D3

The D3 dataset contains a curated sample of social media posts from Jigsaw datasets (Jigsaw, 2018, 2019), annotated for offensiveness in text.

Dataset
JSON

GRASP: A Disagreement Analysis Framework to Assess Group Associations in Pers...

Human annotation plays a core role in machine learning — annotations for supervised models, safety guardrails for generative models, and human feedback for reinforcement...

Dataset
JSON

News and Social Media Articles Dataset

A dataset of annotated news and social media articles, spanning various aspects and media.

Dataset
JSON

Kaggle M5 Competition virtual community dataset

The Kaggle M5 Competition virtual community dataset contains discussion posts, comments, and user-generated content from the M5 competition.

Dataset
JSON

12 datasets found