-
Kaldi Speech Recognition Toolkit
The Kaldi Speech Recognition Toolkit is a widely used open-source toolkit for speech recognition. -
WAV2LETTER++
WAV2LETTER++ is an open-source speech recognition system. The paper does not explicitly name the dataset used, though it is implied to be a speech recognition dataset. -
Corpus of Spoken Dutch
The Corpus of Spoken Dutch (CGN) is a dataset of spoken Dutch recordings. -
Language Models of Spoken Dutch
The dataset consists of subtitles of television shows provided by the Flemish public-service broadcaster VRT. The dataset is used to train language models of spoken Dutch. -
GigaSpeech
GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio. -
FASTINJECT: Injecting Unpaired Text Data into CTC-Based ASR Training
This paper proposes a flat-start joint training method, named FastInject, to inject unpaired text data into CTC-based ASR training. -
TED-LIUM 3
TED-LIUM 3 (TL3) is a dataset of TED talks. The speaker adaptation data for TL3 was split randomly: 2/5 went to the train set, 1/5 to the dev set, and... -
Speaker Anonymization using X-Vector and Neural Waveform Models
Speaker anonymization using x-vector and neural waveform models. -
NIST RT-03 English CTS
The dataset is used for speaker diarization tasks. -
HYPOTHESIS STITCHER FOR END-TO-END SPEAKER-ATTRIBUTED ASR ON LONG-FORM MULTI-...
An end-to-end (E2E) speaker-attributed automatic speech recognition (SA-ASR) model was proposed recently to jointly perform speaker counting, speech recognition and speaker... -
BD-4SK-ASR
The dataset used in this paper is BD-4SK-ASR, an experimental dataset created for the first attempt at developing an ASR system for Sorani Kurdish. -
VoiceHome-2
The dataset used in this paper is VoiceHome-2, an extended corpus for multichannel speech processing in real homes. -
ASRU 2019 Mandarin-English code-switching speech recognition challenge
The ASRU 2019 Mandarin-English code-switching speech recognition challenge dataset. -
Wall Street Journal
The Wall Street Journal dataset is used for syntactic linearization. It contains a large corpus of news articles with their corresponding syntactic trees. -
Video Corpus
A corpus of free and representative video content was gathered. This corpus includes videos having progressive scanning, 1280x720 resolution, and framerates between 24-30 frames... -
Correction Focused Language Model Training for Speech Recognition
Language models have been commonly adopted to boost the performance of automatic speech recognition (ASR) particularly in domain adaptation tasks. Conventional way of LM... -
AudioMNIST dataset
The dataset used in the paper is the AudioMNIST dataset, which contains 30,000 audio recordings. -
Convolutional Neural Networks for Speech Recognition
The Speech Recognition dataset is used for speech recognition tasks.