9 datasets found

  • VOiCES

    The VOiCES dataset is used for testing the speaker recognition system and contains 7,323 identities combined.
  • Explainable Deep Clustering for Monaural Speech Separation

    The proposed X-DC model uses a dataset of mixed speech signals from two, four, or eight speakers.
  • WSJ0-2mix

    The paper uses the WSJ0-2mix dataset, which contains 30 hours of training data and 10 hours of validation data generated from the WSJ0 corpus. The speech...
  • Continuous speech separation: Dataset and analysis

  • CHiME-2

    The CHiME-2 dataset is a speech separation and recognition challenge dataset. It contains 7,138 utterances from 8 speakers, each about 10 seconds long.
  • LibriMix

    The LibriMix dataset is a large corpus of mixed speech, designed for training and evaluating speech separation systems.
  • LRS2

    The LRS2 dataset consists of 48,164 video clips from outdoor shows on BBC television. Each video is accompanied by an audio track corresponding to a sentence of up to 100 characters.
  • Generative Pre-Training for Speech

    Generative models have attracted growing attention in recent years for their remarkable success in tasks that require estimating and sampling a data distribution to generate...
  • WHAM!

    The WHAM! dataset is used for testing the proposed Bayesian factorised speaker-environment adaptive training and test-time adaptation approach for Conformer models.
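
The VOiCES entry above concerns testing a speaker recognition system; such evaluations typically score enrollment/test trials by comparing fixed-dimensional speaker embeddings. A minimal illustrative sketch follows; the embedding dimension, random vectors, and threshold here are hypothetical stand-ins, not properties of VOiCES or any particular system.

```python
import numpy as np

def cosine_score(emb_a, emb_b):
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(emb_a, emb_b) /
                 (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))

# Hypothetical embeddings standing in for the output of a speaker encoder.
rng = np.random.default_rng(1)
enroll = rng.standard_normal(192)  # enrollment utterance embedding
test = rng.standard_normal(192)    # test utterance embedding

score = cosine_score(enroll, test)
same_speaker = score > 0.5  # threshold is application-specific
```

In practice the threshold is tuned on a development set (e.g. to the equal-error-rate operating point) rather than fixed by hand.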
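
Several of the results above (WSJ0-2mix, LibriMix, WHAM!) are mixture corpora built by summing utterances from a base corpus at controlled signal-to-noise ratios. As a rough illustration only (this is not the official generation pipeline of any of these datasets; the function name, padding mode, and synthetic signals are assumptions), two-speaker mixing can be sketched as:

```python
import numpy as np

def mix_at_snr(s1, s2, snr_db):
    """Mix two source signals so s1 is snr_db louder than s2.

    Illustrative sketch of two-speaker mixture creation; real generation
    scripts additionally handle resampling, loudness normalization, and
    optional noise augmentation.
    """
    # Pad the shorter source so both have equal length ("max" mode).
    n = max(len(s1), len(s2))
    s1 = np.pad(s1, (0, n - len(s1)))
    s2 = np.pad(s2, (0, n - len(s2)))

    # Scale s2 so the power ratio of s1 to s2 equals the target SNR.
    p1 = np.mean(s1 ** 2)
    p2 = np.mean(s2 ** 2)
    gain = np.sqrt(p1 / (p2 * 10 ** (snr_db / 10)))
    return s1 + gain * s2

# Synthetic "utterances" standing in for real audio at 16 kHz.
rng = np.random.default_rng(0)
mix = mix_at_snr(rng.standard_normal(16000), rng.standard_normal(8000), 5.0)
```

The separation model is then trained to recover the individual sources from `mix`, with the original utterances serving as targets.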