-
Harassment (HAR) dataset
The dataset used for hate speech detection on Twitter -
HATE dataset
The dataset used for hate speech detection on Twitter -
Sexist/Racist (SR) dataset
The dataset used for hate speech detection on Twitter -
Hate Speech Detection on Twitter
The dataset used for hate speech detection on Twitter -
FRENK Dataset
The FRENK Dataset is a collection of Slovene and English comments annotated for hate speech. -
Wikipedia Hate Speech Dataset
The Wikipedia Hate Speech Dataset is a collection of user comments annotated for hate speech. -
Reddit Hate Detection Dataset
The Reddit Hate Detection Dataset is a collection of Reddit comments annotated for hate speech. -
The Gab Hate Corpus
The Gab Hate Corpus is a collection of 27k posts annotated for hate speech. -
PEACE: Cross-Platform Hate Speech Detection
The PEACE dataset is a collection of social media posts and comments annotated for hate speech. -
COVID-HATE
The dataset contains tweets expressing anti-Asian hate, along with counter-hate tweets in support of people of Asian ethnicity, amid COVID-19. -
PAN Profiling Hate Speech Spreader Task
The PAN Profiling Hate Speech Spreader Task provides a dataset in English and Spanish, with samples collected from Twitter. -
HateXplain
The HateXplain dataset contains 20,000 posts from Gab and Twitter, annotated with hate/offensive/normal labels. -
Bengali Hate Speech Dataset
The Bengali Hate Speech Dataset is a large-scale dataset for hate speech detection in the Bengali language. It contains 8,087 labelled examples, categorized into political,... -
Multimodal Hate Speech Detection in Bengali
A multimodal hate speech detection dataset for the Bengali language. -
Hate Speech Detection Dataset
The dataset used in the paper is a collection of tweets with hate speech and offensive language, annotated with their sentiment. -
Twitter Hate Speech Dataset
A large-scale dataset of tweets, retweets, user activity history, and follower networks, comprising over 161 million tweets from more than 41 million unique users. -
HuggingFace DLab dataset
The HuggingFace DLab dataset is used for assessing fair target-group detection. It contains 135,556 posts with explicit annotations for the target group(s). -
Human-machine collaboration approaches to build a dialogue dataset for hate s...
Human-machine collaboration approaches to build a dialogue dataset for hate speech countering