The LRS2 dataset consists of 48,164 video clips drawn from BBC television shows. Each clip is accompanied by an audio track corresponding to a sentence of up to 100 characters.
The WHAM! dataset is used to evaluate the proposed Bayesian factorised speaker-environment adaptive training and test-time adaptation approach for Conformer models.