- Umsuka English-isiZulu Parallel Corpus
The Umsuka English-isiZulu Parallel Corpus provides a novel, high-quality parallel dataset for machine translation, containing English sentences sampled from both News Crawl...
- MADAR dataset
The MADAR dataset is a parallel corpus for low-resource languages.