Audio Classification - Groups

VGGSound

VGGSound is a large-scale audio-visual dataset containing 10,000 ten-second video clips with corresponding audio files.

Speech Commands Dataset

The keyword spotting model is trained on two datasets: ESC (Dataset for Environmental Sound Classification) and the Speech Commands Dataset.

Speech Commands

The Speech Commands dataset consists of 105,809 one-second audio recordings of 35 spoken words, sampled at 16 kHz. The raw Speech Commands dataset presents audio recordings as a...

ESC-50

The dataset used for training the CNN in cough detection is composed of various modified audio clips gathered from open-source online sources. Each of these audio files...

4 datasets found