-
Google Speech Command Dataset
The Google Speech Command Dataset is a dataset for keyword spotting, which is a task in speech recognition. The dataset contains 12 classes, including 10 keywords and two extra... -
wav2vec 2.0
wav2vec 2.0 is a framework for self-supervised learning of speech representations from raw audio, used to pre-train models for speech recognition tasks; it is a model rather than a dataset. -
Switchboard Corpus
The Switchboard corpus is a collection of two-sided telephone conversations in American English, recorded between pairs of speakers. -
Libri-Light
The dataset used in the paper is Libri-Light, a large collection of mostly unlabelled English speech drawn from LibriVox audiobooks, the same source from which LibriSpeech is derived. The authors used this dataset to pre-train their proposed dual-mode ASR... -
Masked Acoustic Unit for Mispronunciation Detection and Correction
The proposed method uses the acoustic unit (AU) as the intermediary feature for both mispronunciation detection and correction. -
HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus
The HKUST/MTS corpus is a large collection of Mandarin Chinese conversational telephone speech. -
The Wall Street Journal Corpus
The WSJ corpus is a large collection of read speech in which speakers read sentences drawn from Wall Street Journal news text. -
TIMIT Acoustic-Phonetic Continuous Speech Corpus
The TIMIT Acoustic-Phonetic Continuous Speech Corpus contains recordings of 630 speakers from eight major dialect regions of American English, each reading ten phonetically rich sentences. -
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
SpecAugment is a data augmentation method for automatic speech recognition, which masks the mel-spectrogram along the time and frequency axes. -
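The masking step described for SpecAugment can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `spec_augment`, the default mask counts and widths, and filling masked regions with zeros are all assumptions made for the example, and the time-warping step of the original method is omitted.

```python
import numpy as np

def spec_augment(mel, num_freq_masks=1, max_freq_width=8,
                 num_time_masks=1, max_time_width=20, rng=None):
    """Return a copy of `mel` (n_mels x n_frames) with random bands zeroed.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()  # leave the caller's array untouched
    n_mels, n_frames = out.shape

    # Frequency masking: zero a band of consecutive mel channels.
    for _ in range(num_freq_masks):
        width = min(int(rng.integers(0, max_freq_width + 1)), n_mels)
        start = int(rng.integers(0, n_mels - width + 1))
        out[start:start + width, :] = 0.0

    # Time masking: zero a span of consecutive frames.
    for _ in range(num_time_masks):
        width = min(int(rng.integers(0, max_time_width + 1)), n_frames)
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, start:start + width] = 0.0

    return out
```

Passing a seeded generator, for example `spec_augment(mel, rng=np.random.default_rng(0))`, makes the masks reproducible across runs.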
MixSpeech: Data Augmentation for Low-Resource Automatic Speech Recognition
MixSpeech is a data augmentation method for automatic speech recognition, which trains an ASR model by taking a weighted combination of two different speech features as the... -
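The weighted combination of two speech features can be sketched as follows. This is a hedged illustration under stated assumptions: the helper names `mix_features` and `mix_losses` are hypothetical, zero-padding the shorter utterance is a choice made for the example, and combining the two recognition losses with the same weight follows the general mixup recipe rather than anything stated in the truncated description above.

```python
import numpy as np

def mix_features(feat_a, feat_b, lam):
    """Mix two feature matrices (frames x dims) with weight `lam` in [0, 1].

    The shorter input is zero-padded along the time axis (an assumption).
    """
    n = max(len(feat_a), len(feat_b))

    def pad(x):
        return np.pad(x, ((0, n - len(x)), (0, 0)))

    return lam * pad(feat_a) + (1.0 - lam) * pad(feat_b)

def mix_losses(loss_a, loss_b, lam):
    """Combine the losses against each transcript with the same weight."""
    return lam * loss_a + (1.0 - lam) * loss_b
```

In a training loop, `lam` would typically be drawn at random for each pair of utterances, and the model would be scored against both transcripts using the mixed input.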
Speech Commands Dataset
The keyword spotting model is trained on two datasets: ESC (Dataset for Environmental Sound Classification) and the Speech Commands Dataset. -
Google Commands
This dataset has no description
-
Proprietary Speech Dataset
The proprietary speech dataset consists of 184 hours of high-quality US English speech spoken by 11 female and 10 male speakers. -
Speech Commands
The Speech Commands dataset consists of 105,809 one-second audio recordings of 35 spoken words, sampled at 16 kHz. The raw speech commands dataset presents audio recordings as a... -
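A one-second clip at 16 kHz corresponds to 16,000 samples. The sketch below shows one way to bring a waveform to exactly that length before batching; the helper name `fix_length` and the choice of trailing zero-padding are assumptions for illustration, not part of the dataset's tooling.

```python
import numpy as np

SAMPLE_RATE = 16_000      # Hz, as stated for Speech Commands
CLIP_SECONDS = 1.0        # nominal clip duration
TARGET_LEN = int(SAMPLE_RATE * CLIP_SECONDS)  # 16,000 samples

def fix_length(wave, target_len=TARGET_LEN):
    """Trim or zero-pad a 1-D waveform to `target_len` samples."""
    wave = np.asarray(wave, dtype=np.float32)
    if len(wave) >= target_len:
        return wave[:target_len]          # drop any excess samples
    return np.pad(wave, (0, target_len - len(wave)))  # pad the tail with zeros
```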
Switchboard dataset
The dataset used in the paper is the Switchboard dataset, which contains telephone conversations.