Multilingual Offensive Language Identification Dataset (OLID)

This multilingual dataset for offensive language identification in social media contains posts in Arabic, Danish, English, Greek, and Turkish.

Cite this as

Tharindu Ranasinghe, Hansi Hettiarachchi (2024). Dataset: Multilingual Offensive Language Identification Dataset (OLID). https://doi.org/10.57702/d91znhbo

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2010.06278
Author         Tharindu Ranasinghe
More Authors   Hansi Hettiarachchi
Homepage       https://github.com/zampieri/multilingual-offensive-language-identification