Dataset - LDM

The AMI meeting corpus: A pre-announcement

- Dataset
- JSON
Fixed-dimensional acoustic embeddings of variable-length segments in low-reso...

A dataset for the Zero Resource Speech Challenge 2015.
- Dataset
- JSON
The Zero Resource Speech Challenge 2015

A dataset for the Zero Resource Speech Challenge 2015.
- Dataset
- JSON
KWS-DailyTalk

KWS-DailyTalk is a five-shot KWS dataset aimed at detecting 15 different keywords, namely “afternoon”, “airport”, “cash”, “credit card”, “deposit”, “dollar”, “evening”,...
- Dataset
- JSON
Whisper

Whisper is a general-purpose speech recognition model.
- Dataset
- JSON
WIT3 Parallel Corpus

The WIT3 parallel corpus is a large-scale corpus of transcribed and translated talks.
- Dataset
- JSON
VoxForge dataset

The VoxForge dataset is a collection of audio recordings of human speech.
- Dataset
- JSON
Isolet dataset

The Isolet dataset contains 4,000 13-channel audio recordings of 100 speakers.
- Dataset
- JSON
Hub5e-swb Dataset

The Hub5e-swb dataset is the Switchboard portion of the NIST Hub5 2000 English evaluation set, consisting of conversational telephone speech recordings.
- Dataset
- JSON
Resource Management Audio-Visual (RMAV) dataset

The RMAV dataset consists of recordings of 20 British English speakers, with up to 200 utterances per speaker drawn from the Resource Management (RM) sentences.
- Dataset
- JSON
AVLetters-2 (AVL2) dataset

The AVL2 dataset consists of seven utterances per speaker reciting the alphabet.
- Dataset
- JSON
End-to-End Neural Speaker Diarization with Permutation-Free Objectives

The End-to-End Neural Speaker Diarization dataset is a benchmark for speaker diarization.
- Dataset
- JSON
The Third DIHARD Diarization Challenge

The DIHARD dataset is a benchmark for speaker diarization.
- Dataset
- JSON
Perception of Phonological Assimilation

The dataset used in this study consists of 48 stimuli, each containing a word pair with a place assimilation, and a carrier sentence. The stimuli are designed to test the...
- Dataset
- JSON
FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge

Speech recognition systems driven by Deep Neural Networks (DNNs) have revolutionized human-computer interaction through voice interfaces, which significantly facilitate our...
- Dataset
- JSON
Corpus of Spoken Dutch

The Corpus of Spoken Dutch (CGN) is a dataset of spoken Dutch recordings.
- Dataset
- JSON
Language Models of Spoken Dutch

The dataset consists of subtitles of television shows provided by the Flemish public-service broadcaster VRT. The dataset is used to train language models of spoken Dutch.
- Dataset
- JSON
NIST RT-03 English CTS

The NIST RT-03 English conversational telephone speech (CTS) dataset is used for speaker diarization tasks.
- Dataset
- JSON
AudioMNIST dataset

The AudioMNIST dataset contains 30,000 audio recordings of spoken digits.
- Dataset
- JSON
MLS

MLS: A large-scale multilingual dataset for speech research.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

41 datasets found