Speech - Groups

MUSAN: A Music, Speech, and Noise Corpus

MUSAN is a corpus of music, speech, and noise recordings.

TIMIT

The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...

DAC

A speech dataset used to train and test the proposed LaDiffCodec model.

EnCodec

A speech dataset used to train and test the proposed LaDiffCodec model.

VCTK Corpus

The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.

Librispeech

The LibriSpeech dataset is a large-scale multi-speaker corpus of approximately 1,000 hours of read English speech derived from LibriVox audiobooks.

LibriLight

The dataset used in this paper is a large-scale production ASR system, which includes multi-domain (MD) data sets in English. The MD data sets include medium-form (MF) and...

7 datasets found