- The WASABI dataset
A large song dataset containing lyrics and other metadata for roughly 2M songs in 21 languages.
- MS COCO dataset
The MS COCO dataset is a large benchmark for image captioning, containing 328K images with 5 captions each.