Audio Processing - Groups

AudioMNIST dataset

The AudioMNIST dataset contains 30,000 audio recordings of spoken digits.

AVA-Speech

The AVA-Speech dataset is a publicly available dataset of movies densely labeled with speech activity.

A Hybrid CNN-BiLSTM Voice Activity Detector

A hybrid CNN-BiLSTM voice activity detector (VAD) incorporating both convolutional neural network (CNN) and bidirectional long short-term memory...

Librispeech

The Librispeech dataset is a large-scale speaker-dependent speech corpus containing 1080 hours of speech, 5600 utterances, and 1000 speakers.

LibriLight

The dataset used in this paper is a large-scale production ASR system, which includes multi-domain (MD) data sets in English. The MD data sets include medium-form (MF) and...

Google Speech Commands Dataset Version II

The Google Speech Commands Dataset Version II contains 105,829 utterances of 35 words from 2,618 speakers with a sampling rate of 16 kHz.
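
The figures quoted for this dataset can be turned into rough per-class and per-speaker averages, which is a quick way to judge how balanced the corpus is likely to be. The sketch below uses only the numbers stated in the description; the helper names `average` and `samples_for`, and the 1-second clip length passed to `samples_for`, are illustrative assumptions and not part of the dataset's documentation.

```python
# Figures quoted in the dataset description.
UTTERANCES = 105_829
WORDS = 35
SPEAKERS = 2_618
SAMPLE_RATE_HZ = 16_000


def average(total: int, groups: int) -> float:
    """Mean number of items per group (hypothetical helper)."""
    if groups <= 0:
        raise ValueError("groups must be positive")
    return total / groups


def samples_for(duration_s: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples in a clip of the given duration (hypothetical helper)."""
    return round(duration_s * sample_rate_hz)


per_word = average(UTTERANCES, WORDS)
per_speaker = average(UTTERANCES, SPEAKERS)

print(f"Average utterances per word:    {per_word:.1f}")
print(f"Average utterances per speaker: {per_speaker:.1f}")
# The 1.0 s duration is an assumed clip length used only for illustration.
print(f"Samples in an assumed 1 s clip: {samples_for(1.0)}")
```

These are averages only; the actual distribution across words and speakers may be uneven.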
