Hate Speech Detection - Groups

Hate Speech Tweets dataset

The Hate Speech Tweets dataset contains over 24,000 English tweets, each labeled as non-offensive, hate speech, or profanity.

HateXplain

The HateXplain dataset contains 20,000 posts from Gab and Twitter, annotated with hate/offensive/normal labels.

Latent Hatred: A Benchmark for Understanding Implicit Hate Speech

The implicit hate dataset is a specialized collection aimed at detecting implicit hate speech.

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

The HateXplain dataset is a benchmark dataset for explainable hate speech detection.

Hate Speech Detection using Large Language Models

A collection of datasets used for probing LLMs for hate speech detection, including the HateXplain, implicit hate, and ToxicSpans datasets.

Bengali Hate Speech Dataset

The Bengali Hate Speech Dataset is a large-scale dataset for hate speech detection in the Bengali language. It contains 8,087 labelled examples, categorized into political,...

Multimodal Hate Speech Detection in Bengali

A multimodal hate speech detection dataset for the Bengali language.

DOLaH

A dataset containing 2,026 Facebook posts collected from Twitter, labeled as offensive or non-offensive.

The Hateful Memes dataset

The Hateful Memes dataset aims to help develop models that more effectively detect multimodal hateful content.

Hate Speech Detection Dataset

The dataset used in the paper is a collection of tweets with hate speech and offensive language, annotated with their sentiment.

Hateful Memes Challenge

The Hateful Memes dataset is a multimodal dataset containing 10,000+ new examples of multimodal content.

HuggingFace DLab dataset

The HuggingFace DLab dataset is used for assessing fair target-group detection. It contains 135,556 posts with explicit annotations for the target group(s).

Hateful Memes Challenge Dataset

A dataset for detecting harmful memes, particularly in the multicultural and multilingual context of Singapore.

33 datasets found