Rank-1 Constrained Multichannel Wiener Filter for Speech Recognition in Noisy Environments
Multichannel linear filters for speech recognition in noisy environments -
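For context, this is a standard way the rank-1 constraint is used in the multichannel Wiener filter (MWF); the paper's exact estimator may differ in notation and details. With speech and noise spatial covariance matrices \(\boldsymbol{\Phi}_{\mathbf{xx}}\) and \(\boldsymbol{\Phi}_{\mathbf{nn}}\), trade-off parameter \(\mu\), steering vector \(\mathbf{h}\), speech power \(\phi_s\), and reference-channel selector \(\mathbf{e}_1\):

```latex
\mathbf{w}_{\mathrm{MWF}}
  = \left(\boldsymbol{\Phi}_{\mathbf{xx}} + \mu\,\boldsymbol{\Phi}_{\mathbf{nn}}\right)^{-1}
    \boldsymbol{\Phi}_{\mathbf{xx}}\,\mathbf{e}_1 ,
\qquad
\boldsymbol{\Phi}_{\mathbf{xx}} \approx \phi_s\,\mathbf{h}\mathbf{h}^{H}
\;\Longrightarrow\;
\mathbf{w}_{\mathrm{R1\text{-}MWF}}
  = \frac{\boldsymbol{\Phi}_{\mathbf{nn}}^{-1}\boldsymbol{\Phi}_{\mathbf{xx}}}
         {\mu + \operatorname{tr}\!\left(\boldsymbol{\Phi}_{\mathbf{nn}}^{-1}\boldsymbol{\Phi}_{\mathbf{xx}}\right)}\,
    \mathbf{e}_1 .
```

The closed form on the right follows from the matrix inversion lemma once the speech covariance is assumed rank-1. -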
English Broadcast News (BN) dataset
The dataset used in this paper is the English Broadcast News (BN) dataset. -
Improvements to Deep Convolutional Neural Networks for LVCSR
Deep Convolutional Neural Networks (CNNs) are more powerful than Deep Neural Networks (DNNs), as they are better able to reduce spectral variation in the input signal. -
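As a rough illustration of that claim (not the paper's architecture), the sketch below applies a 2-D convolution to log-mel features and pools only along the frequency axis, so small spectral shifts of the same sound land in the same pooled activation; all layer sizes here are made up for the example.

```python
import torch
import torch.nn as nn

class ConvFrontEnd(nn.Module):
    """Minimal conv + frequency-pooling front end over log-mel features."""
    def __init__(self, n_filters=64):
        super().__init__()
        self.conv = nn.Conv2d(1, n_filters, kernel_size=(9, 9), padding=4)
        # Pool only along the frequency axis (last dim), so a phone shifted by a
        # few mel bins still activates the same pooled unit.
        self.pool = nn.MaxPool2d(kernel_size=(1, 3), stride=(1, 3))

    def forward(self, x):                    # x: (batch, 1, time, mel)
        return self.pool(torch.relu(self.conv(x)))

feats = torch.randn(8, 1, 100, 40)           # 8 chunks, 100 frames, 40 mel bins
out = ConvFrontEnd()(feats)                  # -> (8, 64, 100, 13)
``` -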
Error Explainable Benchmark (EEB) dataset
The proposed Error Explainable Benchmark (EEB) dataset considers both speech- and text-level error types to diagnose and validate ASR models and post-processors. -
SLR41 and SLR44 datasets
The SLR41 and SLR44 datasets consist of pairs of audio recordings and corresponding transcripts. -
SLR35 and SLR36 datasets
The SLR35 and SLR36 datasets consist of 200,000 speech recordings from native speakers. -
Magic Data
The Magic Data dataset consists of 3.5 hours of Indonesian scripted speech from 10 speakers. -
TITML-IDN, Magic Data, Common Voice, SLR35, SLR36, SLR41, and SLR44 datasets
The study uses the TITML-IDN, Magic Data, Common Voice, SLR35, SLR36, SLR41, and SLR44 datasets for training and evaluation of the ASR system. -
Speech EEG Database
Two simultaneous speech-EEG recording databases were collected for this work. For database A, five female and five male subjects took part in the experiment. For database B, five male and three... -
LibriLight: A Benchmark for ASR with Limited or No Supervision
The LibriLight dataset is a large-scale speech corpus used for self-supervised speech recognition tasks. -
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM
The TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM is a widely used dataset for speech recognition tasks. -
Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching
Training unsupervised speech recognition systems presents challenges due to GAN-associated instability, misalignment between speech and text, and significant memory demands. To... -
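The entry does not spell out the matching objective, but assuming the n-skipgram and positional unigram statistics are count distributions over unit sequences, a minimal sketch of computing such statistics (with made-up phoneme tokens) might look like this:

```python
from collections import Counter
from itertools import combinations

def skipgram_counts(tokens, n=2, max_gap=2):
    """Count n-skipgrams: ordered n-tuples of tokens whose consecutive
    indices are at most max_gap positions apart."""
    counts = Counter()
    for idxs in combinations(range(len(tokens)), n):
        if all(j - i <= max_gap + 1 for i, j in zip(idxs, idxs[1:])):
            counts[tuple(tokens[i] for i in idxs)] += 1
    return counts

def positional_unigram_counts(tokens, n_bins=4):
    """Count unigrams jointly with a coarse relative position in the sequence."""
    counts = Counter()
    length = max(len(tokens), 1)
    for i, tok in enumerate(tokens):
        counts[(tok, min(n_bins - 1, i * n_bins // length))] += 1
    return counts

print(skipgram_counts(["DH", "AH", "K", "AE", "T"]))
print(positional_unigram_counts(["DH", "AH", "K", "AE", "T"]))
```

Distribution-matching approaches of this kind compare such statistics between speech-derived units and text, avoiding the adversarial training step. -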
TI-46 Spoken Digits Recognition
The TI-46 spoken digits dataset comprises 5 speakers, each uttering each of the 10 digits 10 times (500 samples). -
The ESTER phase II evaluation campaign for the rich transcription of French broadcast news
The ESTER phase II evaluation campaign for the rich transcription of French broadcast news contains news reports. -
Stanford Neural Machine Translation Systems for Spoken Language Domain
Stanford neural machine translation systems for spoken language domain. -
Arabic Digits Dataset
The dataset used in this paper covers spoken digit recognition of Arabic digits from 0 to 9. -
Amazon Alexa Dataset
A 23,000-hour corpus of untranscribed, de-identified, far-field English voice command and voice query speech. -
Open Subtitles dataset
The Open Subtitles dataset consists of transcriptions of spoken dialog in movies and television shows. -
Loss Prediction: End-to-End Active Learning for Speech Recognition
End-to-end speech recognition systems usually require huge amounts of labeling resources, and annotating speech data is complicated and expensive. Active learning is the...
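As a hedged sketch of the general loss-prediction idea for active learning (not the paper's exact model): a small auxiliary head is trained to predict the recognizer's loss from pooled encoder features, and the unlabeled utterances with the highest predicted loss are sent for transcription. All module names and dimensions below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class LossPredictor(nn.Module):
    """Tiny auxiliary head mapping a pooled encoder feature to a scalar predicted
    loss; in practice it would be trained jointly with the ASR model to regress
    the per-utterance training loss."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, pooled_feats):               # (batch, feat_dim)
        return self.net(pooled_feats).squeeze(-1)  # (batch,)

def select_for_labeling(loss_predictor, unlabeled_feats, budget):
    """Rank unlabeled utterances by predicted loss and return the indices of the
    top-`budget` ones, i.e. those the recognizer is expected to get most wrong."""
    with torch.no_grad():
        scores = loss_predictor(unlabeled_feats)
    return torch.topk(scores, k=budget).indices.tolist()

predictor = LossPredictor()
pool_feats = torch.randn(1000, 256)   # pooled encoder features of 1000 unlabeled utterances
to_label = select_for_labeling(predictor, pool_feats, budget=32)
``` -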