Speech Separation - Groups

WSJ0-2mix

The paper uses the WSJ0-2mix dataset, which contains 30 hours of training data and 10 hours of validation data generated from the WSJ0 dataset. The speech...

CHiME-2

The CHiME-2 dataset comes from a speech separation and recognition challenge. It contains 7,138 utterances from 8 speakers, each with 10 seconds of speech.

LRS2

The LRS2 dataset consists of 48,164 video clips from outdoor shows on BBC television. Each video is accompanied by an audio track corresponding to a sentence of up to 100 characters.

Generative Pre-Training for Speech

Generative models have attracted growing attention in recent years for their remarkable success in tasks that require estimating and sampling data distribution to generate...

WHAM!

The WHAM! dataset is used for testing the proposed Bayesian factorised speaker-environment adaptive training and test time adaptation approach for Conformer models.

5 datasets found
