- GTZAN dataset
The GTZAN dataset is a small but popular dataset for music genre classification. It contains 10 musical genres, each with 100 audio clips of 30 seconds.
- Google Speech Command Dataset
The Google Speech Command Dataset is a dataset for keyword spotting, a task in speech recognition. The dataset contains 12 classes, including 10 keywords and two extra...
- Speech Commands Dataset
The keyword spotting model is trained on two datasets: ESC: Dataset for Environmental Sound Classification, and the Speech Commands Dataset.
- VoxCeleb: A Large-Scale Speaker Identification Dataset
- LibriSpeech
LibriSpeech is a large-scale corpus of read English speech derived from LibriVox audiobooks, containing roughly 1,000 hours of audio from over 2,000 speakers.
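
As a quick sanity check on the GTZAN figures listed above, the sketch below computes the total number of clips and the total audio duration implied by 10 genres, 100 clips per genre, and 30 seconds per clip. The constant and variable names are illustrative only and are not part of any dataset API.

```python
# Figures taken from the GTZAN description in the list above.
NUM_GENRES = 10        # musical genres
CLIPS_PER_GENRE = 100  # audio clips per genre
CLIP_SECONDS = 30      # length of each clip in seconds

# Total clip count and total duration implied by those figures.
total_clips = NUM_GENRES * CLIPS_PER_GENRE
total_seconds = total_clips * CLIP_SECONDS
total_hours = total_seconds / 3600

print(f"{total_clips} clips, {total_seconds} s, about {total_hours:.2f} hours")
```

This works out to 1000 clips and a little over 8 hours of audio, which is why GTZAN is usually described as a small dataset.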