-
Speech EEG Database
Two simultaneous speech EEG recording databases were used for this work. For database A, five female and five male subjects took part in the experiment. For database B five male and three... -
LibriLight: A Benchmark for ASR with Limited or No Supervision
The LibriLight dataset is a large-scale speech corpus used for self-supervised speech recognition tasks. -
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM
The TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM is a widely used dataset for speech recognition tasks. -
UNSUPERVISED SPEECH RECOGNITION WITH N-SKIPGRAM AND POSITIONAL UNIGRAM MATCHING
Training unsupervised speech recognition systems presents challenges due to GAN-associated instability, misalignment between speech and text, and significant memory demands. To... -
TI-46 Spoken Digits Recognition
The TI-46 spoken digits dataset comprises 5 speakers, each uttering each of the 10 digits 10 times (500 samples). -
The REPERE Corpus: a multimodal corpus for person recognition
The REPERE corpus, a multimodal corpus for person recognition, contains TV broadcasts. -
The ESTER phase II evaluation campaign for the rich transcription of French b...
The corpus of the ESTER phase II evaluation campaign for the rich transcription of French broadcast news contains news reports. -
The ETAPE corpus for the evaluation of speech-based TV content processing in ...
The ETAPE corpus, built for the evaluation of speech-based TV content processing in the French language, contains TV broadcasts. -
DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM
The TIMIT acoustic-phonetic continuous speech corpus CD-ROM contains read speech from 630 speakers. -
indic-punct
The dataset is used for automatic punctuation restoration and inverse text normalization for Indic languages. -
Stanford Neural Machine Translation Systems for Spoken Language Domain
Stanford neural machine translation systems for spoken language domain. -
Arabic Digits Dataset
A dataset for spoken digit recognition covering the Arabic digits 0 to 9. -
Unsupervised word segmentation and lexicon discovery using acoustic word embe...
A dataset for the Zero Resource Speech Challenge 2015. -
Fixed-dimensional acoustic embeddings of variable-length segments in low-reso...
A dataset for the Zero Resource Speech Challenge 2015. -
The Zero Resource Speech Challenge 2015
A dataset for the Zero Resource Speech Challenge 2015. -
A segmental Bayesian framework for fully-unsupervised large-vocabulary speech...
A segmental Bayesian model for full-coverage segmentation and clustering of conversational speech audio. -
Amazon Alexa Dataset
A 23 thousand hour corpus of untranscribed, de-identified, far-field, English voice command and voice query speech. -
DeepSpeech
The DeepSpeech dataset is used to evaluate the proposed watermarking scheme.