-
Unsupervised word segmentation and lexicon discovery using acoustic word embe...
A dataset for the Zero Resource Speech Challenge 2015. -
Fixed-dimensional acoustic embeddings of variable-length segments in low-reso...
A dataset for the Zero Resource Speech Challenge 2015. -
The Zero Resource Speech Challenge 2015
A dataset for the Zero Resource Speech Challenge 2015. -
A segmental Bayesian framework for fully-unsupervised large-vocabulary speech...
A segmental Bayesian model for full-coverage segmentation and clustering of conversational speech audio. -
Open Subtitles dataset
The Open Subtitles dataset consists of transcriptions of spoken dialog in movies and television shows. -
IPA Transcription of Bengali Texts
A comprehensive study of IPA transcription issues and challenges for Bangla, a novel IPA transcription framework, a DUAL-IPA, a sentence-level IPA-transcribed parallel corpus... -
Corpus of Spoken Dutch
The Corpus of Spoken Dutch (CGN) is a dataset of spoken Dutch recordings. -
Language Models of Spoken Dutch
The dataset consists of subtitles of television shows provided by the Flemish public-service broadcaster VRT. The dataset is used to train language models of spoken Dutch. -
Sanskrit ASR dataset
A dataset for Sanskrit ASR. -
वाक् सञ्चयः (/Vāksañcayaḥ/)
A new Sanskrit speech corpus and a large-vocabulary ASR system for Sanskrit. -
Masked Acoustic Unit for Mispronunciation Detection and Correction
The proposed method uses the acoustic unit (AU) as the intermediary feature for both mispronunciation detection and correction. -
English and Luganda datasets for ASR-free keyword spotting
South African English and Luganda datasets -
Feature learning for efficient ASR-free keyword spotting in low-resource lang...
ASR-free keyword spotting in low-resource languages -
Google Speech Commands Dataset
The Google Speech Commands Dataset contains 64,727 one-second-long utterance files, each recorded and labeled with one of 30 target categories. -
Temporal Convolution for Real-time Keyword Spotting on Mobile Devices
Keyword spotting (KWS) plays a critical role in enabling speech-based user interactions on smart devices. Recent developments in the field of deep learning have led to wide... -
Switchboard
Human speech data comprises a rich set of domain factors such as accent, syntactic and semantic variety, or acoustic environment.