-
SEAME corpus
The SEAME corpus is a Mandarin-English code-switching speech corpus. -
TED-LIUM 3
TED-LIUM 3 (TL3) is a TED talks dataset. The speaker adaptation data for TL3 was split randomly: 2/5 went to the train set, 1/5 to the dev set, and... -
GTZAN dataset
The GTZAN dataset is a small but popular dataset for genre classification. It covers 10 musical genres, each with 100 audio snippets that are 30 s long. -
Free Spoken Digit Dataset
The dataset is a collection of 8 kHz audio recordings of spoken digits from 'zero' to 'nine'. -
Lwazi speech corpus
Collecting and evaluating speech recognition corpora for nine Southern Bantu languages -
NCHLT speech corpus
The NCHLT speech corpus of the South African languages -
Attention-based beamformers for multi-channel speech recognition
The proposed 2D Conv-Attention model is compared with a traditional neural beamformer and multi-head attention based model. -
TIMIT dataset
The paper uses this dataset to study a phonetically and phonologically local allophonic distribution in English, where voiceless stops surface as aspirated... -
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognitio...
The Kazakh speech corpus (KSC) contains around 332 hours of transcribed audio comprising over 153,000 utterances spoken by participants from different regions and age groups, as... -
Libri-Light
The dataset used in the paper is Libri-Light, a large collection of mostly unlabelled English speech drawn from LibriVox audiobooks, the same source as LibriSpeech. The authors used this dataset to pre-train their proposed dual-mode ASR... -
Radio browsing for developmental monitoring in Uganda
Radio browsing system for humanitarian relief