- Corpus of Regional African American Language (CORAAL)
This dataset comprises more than 150 sociolinguistic interviews with African American English (AAE) speakers born between 1891 and 2005.
- Twitter AAE dataset
This dataset contains tweets classified as AAE or Mainstream American English (MAE). Tweets were classified using an ML model, and we consider a subset of tweets where the model...