- FHM Dataset
The FHM dataset is a multimodal dataset for detecting hateful memes on social media.
- Dynamic Dataset
The Dynamic dataset is a collection of tweets generated using a human-and-model-in-the-loop process and contains challenging perturbations.
- Waseem Dataset
The Waseem dataset is a collection of tweets containing hate speech, with a focus on sexist and racist language.
- Hate Speech Tweets Dataset
The Hate Speech Tweets dataset contains over 24,000 English tweets, each labeled as non-offensive, hate speech, or profanity.
- HateXplain
The HateXplain dataset contains 20,000 posts from Gab and Twitter, annotated with hate/offensive/normal labels.
- Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
The implicit hate dataset is a specialized collection of data aimed at detecting implicit hate speech.
- HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
The HateXplain dataset is a benchmark dataset for explainable hate speech detection.
- Hate Speech Detection using Large Language Models
The datasets used for probing LLMs for hate speech detection include the HateXplain, implicit hate, and ToxicSpans datasets.
- The Hateful Memes dataset
The Hateful Memes dataset aims to help develop models that more effectively detect multimodal hateful content.
- Hateful Memes Challenge
The Hateful Memes dataset is a multimodal dataset containing 10,000+ new examples of multimodal content.