- FRENK Dataset
The FRENK Dataset is a collection of Slovene and English comments annotated for hate speech.
- Wikipedia Hate Speech Dataset
The Wikipedia Hate Speech Dataset is a collection of user comments annotated for hate speech.
- Reddit Hate Detection Dataset
The Reddit Hate Detection Dataset is a collection of Reddit comments annotated for hate speech.
- The Gab Hate Corpus
The Gab Hate Corpus is a collection of 27k posts annotated for hate speech.
- PEACE: Cross-Platform Hate Speech Detection
The PEACE dataset is a collection of social media posts and comments annotated for hate speech.
- Statistical Analysis of Perspective Scores on Hate Speech Detection
Hate speech detection has attracted growing research attention in recent years as offensive language on social media has increased rapidly.
- HateXplain
The HateXplain dataset contains 20,000 posts from Gab and Twitter, each annotated with one of three labels: hate, offensive, or normal.
- HuggingFace DLab dataset
The HuggingFace DLab dataset is used for assessing fair target-group detection. It contains 135,556 posts with explicit annotations for the target group(s).