-
Harassment (HAR) dataset
A dataset used for hate speech detection on Twitter. -
HATE dataset
A dataset used for hate speech detection on Twitter. -
Sexist/Racist (SR) dataset
A dataset used for hate speech detection on Twitter. -
Hate Speech Detection on Twitter
A dataset used for hate speech detection on Twitter. -
FRENK Dataset
The FRENK Dataset is a collection of Slovene and English comments annotated for hate speech. -
Wikipedia Hate Speech Dataset
The Wikipedia Hate Speech Dataset is a collection of user comments annotated for hate speech. -
Reddit Hate Detection Dataset
The Reddit Hate Detection Dataset is a collection of Reddit comments annotated for hate speech. -
The Gab Hate Corpus
The Gab Hate Corpus is a collection of 27k posts annotated for hate speech. -
PEACE: Cross-Platform Hate Speech Detection
The PEACE dataset is a collection of social media posts and comments annotated for hate speech. -
Automated hate speech detection and the problem of offensive language
Automated hate speech detection and the problem of offensive language. -
Tweet-BLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related M...
Tweet-BLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related Microblogs on Twitter. -
Statistical Analysis of Perspective Scores on Hate Speech Detection
Hate speech detection has become an active research topic in recent years because of the rapid growth of offensive language on social media. -
Dynamic Dataset
The Dynamic dataset is a collection of tweets generated using a human-and-model-in-the-loop process; it contains challenging perturbations. -
Waseem Dataset
The Waseem dataset is a collection of tweets containing hate speech, with a focus on sexist and racist language. -
COVID-HATE
The dataset contains tweets expressing anti-Asian hate, together with tweets countering that hate in support of people of Asian ethnicity during COVID-19. -
PAN Profiling Hate Speech Spreader Task
The PAN Profiling Hate Speech Spreader Task provides a dataset in English and Spanish, with samples collected from Twitter. -
Korean Online Hate Speech Dataset for Multilabel Classification
A Korean online hate speech dataset for multilabel classification. -
Hateful Memes Dataset
The Hateful Memes Dataset consists of a training set of 8,500 images, a dev set of 500 images, and a test set of 1,000 images. The meme text is present on the images, but also...