Dataset - LDM

Amazon Alexa Dataset

A 23,000-hour corpus of untranscribed, de-identified, far-field English speech consisting of voice commands and voice queries.
- Dataset
- JSON
Open Subtitles dataset

The Open Subtitles dataset consists of transcriptions of spoken dialog in movies and television shows.
- Dataset
- JSON
Loss Prediction: End-to-End Active Learning for Speech Recognition

End-to-end speech recognition systems usually require large amounts of labeled data, yet annotating speech is complicated and expensive. Active learning is the...
- Dataset
- JSON
Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech...

This paper presents a well-known music identification method and implements it as a neural net.
- Dataset
- JSON
LRW

The LRW dataset is an English language lip reading dataset, containing 500 different words, each spoken by over 1,000 persons.
- Dataset
- JSON
AISHELL-1

The AISHELL-1 dataset is a Mandarin speech corpus, consisting of 178 hours of speech, with 11 domains and 400 speakers from different accent areas in China.
- Dataset
- JSON
BABEL-Pashto

The BABEL-Pashto dataset is a multilingual speech recognition dataset containing Pashto speech recordings.
- Dataset
- JSON
asya

asya is a mobile application that listens to a person’s voice and provides private feedback on their verbal communication.
- Dataset
- JSON
Speech Intelligibility Prediction with DNN-based Performance Measures

The dataset used for speech intelligibility prediction with DNN-based performance measures.
- Dataset
- JSON
DNS-5 dataset

The dataset used in the paper is a benchmarking dataset for speech-to-speech translation.
- Dataset
- JSON
Bengali Medical Corpus

A comprehensive 46-hour Bengali medical corpus encompassing disease names, symptoms, and symptom severity.
- Dataset
- JSON
Highly-Reverberant Real Environment database (HRRE)

Highly-Reverberant Real Environment database (HRRE) contains 13.4 hours of data recorded in real reverberant environments and consists of 20 different testing conditions.
- Dataset
- JSON
TEDLIUM2

The TEDLIUM2 dataset is a large corpus of audio recordings of human speech, with a focus on speech recognition tasks.
- Dataset
- JSON
Fleurs

Few-shot Learning Evaluation of Universal Representations of Speech (FLEURS).
- Dataset
- JSON
ALLSSTAR

A large-scale dataset of L1 and L2 scripted and spontaneous speech recordings and their transcripts.
- Dataset
- JSON
Eesen

The Eesen dataset is a speech recognition dataset used in the Eesen framework.
- Dataset
- JSON
OpenSeq2Seq

The OpenSeq2Seq dataset is a speech recognition dataset used in the OpenSeq2Seq framework.
- Dataset
- JSON
ESPNet

The ESPNet dataset is a speech recognition dataset used in the ESPNet framework.
- Dataset
- JSON
Kaldi Speech Recognition Toolkit

The Kaldi Speech Recognition Toolkit is a widely used open-source toolkit for speech recognition.
- Dataset
- JSON
WAV2LETTER++

The paper does not name its dataset explicitly; it appears to be a speech recognition dataset.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
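
As a rough illustration of API access, the sketch below builds a search URL and reads a response. It assumes the registry exposes a CKAN-style `package_search` action; the base URL, the endpoint path, and the response fields (`success`, `result`, `count`, `results`, `title`, `notes`) are assumptions and should be checked against the API Docs. The sample response is constructed by hand from entries on this page, so nothing here contacts a server.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL: replace with the registry's real address from its API docs.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=20, start=0):
    """Build a CKAN-style package_search URL (assumed API shape)."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{BASE_URL}/api/3/action/package_search?{params}"


def summarize(response_text):
    """Return (total_count, [(title, description), ...]) from a CKAN-style response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("registry reported an unsuccessful request")
    result = payload["result"]
    entries = [(d.get("title", ""), d.get("notes", "")) for d in result["results"]]
    return result["count"], entries


# Hand-written sample mimicking the assumed response shape (no network call).
sample = json.dumps({
    "success": True,
    "result": {
        "count": 97,
        "results": [
            {"title": "LRW", "notes": "An English language lip reading dataset."},
            {"title": "AISHELL-1", "notes": "A Mandarin speech corpus."},
        ],
    },
})

url = build_search_url("speech recognition", rows=2)
count, entries = summarize(sample)
print(url)
print(count, [title for title, _ in entries])
```

To query a live registry, the URL from `build_search_url` could be fetched with `urllib.request.urlopen` and the body passed to `summarize`; paging through all results would mean increasing `start` by `rows` until `count` entries have been read.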

97 datasets found