Hate Speech Detection - Groups

Harassment (HAR) dataset

The dataset used for hate speech detection on Twitter
- Dataset
- JSON
HATE dataset

The dataset used for hate speech detection on Twitter
- Dataset
- JSON
Sexist/Racist (SR) dataset

The dataset used for hate speech detection on Twitter
- Dataset
- JSON
Hate Speech Detection on Twitter

The dataset used for hate speech detection on Twitter
- Dataset
- JSON
FRENK Dataset

The FRENK Dataset is a collection of Slovene and English comments annotated for hate speech.
- Dataset
- JSON
Wikipedia Hate Speech Dataset

The Wikipedia Hate Speech Dataset is a collection of user comments annotated for hate speech.
- Dataset
- JSON
Reddit Hate Detection Dataset

The Reddit Hate Detection Dataset is a collection of Reddit comments annotated for hate speech.
- Dataset
- JSON
The Gab Hate Corpus

The Gab Hate Corpus is a collection of 27k posts annotated for hate speech.
- Dataset
- JSON
PEACE: Cross-Platform Hate Speech Detection

The PEACE dataset is a collection of social media posts and comments annotated for hate speech.
- Dataset
- JSON
MUTE

The MUTE dataset is a multimodal dataset for detecting hateful memes.
- Dataset
- JSON
Automated hate speech detection and the problem of offensive language

Automated hate speech detection and the problem of offensive language.
- Dataset
- JSON
Tweet-BLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related M...

Tweet-BLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related Microblogs on Twitter.
- Dataset
- JSON
Statistical Analysis of Perspective Scores on Hate Speech Detection

Hate speech detection has become a hot topic in recent years due to the exponential growth of offensive language in social media.
- Dataset
- JSON
Dynamic Dataset

The Dynamic dataset is a collection of tweets generated using a human-and-model-in-the-loop process and containing challenging perturbations.
- Dataset
- JSON
Waseem Dataset

The Waseem dataset is a collection of tweets containing hate speech, with a focus on sexist and racist language.
- Dataset
- JSON
COVID-HATE

The dataset contains tweets expressing anti-Asian hate and tweets countering hate speech in support of people of Asian ethnicity amid COVID-19.
- Dataset
- JSON
PAN Profiling Hate Speech Spreader Task

The PAN Profiling Hate Speech Spreader Task provides a dataset in English and Spanish, with samples collected from Twitter.
- Dataset
- JSON
Korean Online Hate Speech Dataset for Multilabel Classification

Korean Online Hate Speech Dataset for Multilabel Classification
- Dataset
- JSON
Hateful Memes Dataset

The Hateful Memes Dataset consists of a training set of 8500 images, a dev set of 500 images, and a test set of 1000 images. The meme text is present on the images, but also...
- Dataset
- JSON
CREHate

CREHate is a cross-cultural English hate speech dataset comprising 1,580 posts from five English-speaking countries—AU, GB, SG, US, and ZA.
- Dataset
- JSON

33 datasets found