Speech Separation - Groups

Explainable Deep Clustering for Monaural Speech Separation

The proposed X-DC model uses a dataset of mixed speech signals of two, four, or eight speakers.

LibriMix

The LibriMix dataset is a large corpus of mixed speech, designed for training and evaluating speech separation systems.

LRS2

The LRS2 dataset consists of 48,164 video clips from outdoor shows on BBC television. Each clip is accompanied by audio corresponding to a sentence of up to 100 characters.

WHAM!

The WHAM! dataset is used for testing the proposed Bayesian factorised speaker-environment adaptive training and test time adaptation approach for Conformer models.

4 datasets found
