9 datasets found

  • VOiCES

    The VOiCES dataset is used for testing the speaker recognition system and contains 7,323 identities combined.
  • Explainable Deep Clustering for Monaural Speech Separation

    The proposed X-DC model uses a dataset of mixed speech signals from two, four, or eight speakers.
  • WSJ0-2mix

    The paper uses the WSJ0-2mix dataset, which contains 30 hours of training data and 10 hours of validation data generated from the WSJ0 corpus. The speech...
  • Continuous speech separation: Dataset and analysis

  • CHiME-2

    The CHiME-2 dataset is a speech separation and recognition challenge dataset. It contains 7,138 utterances from 8 speakers, each about 10 seconds long.
  • LibriMix

    The LibriMix dataset is a large corpus of mixed speech, designed for training and evaluating speech separation systems.
  • LRS2

    The LRS2 dataset consists of 48,164 video clips from outdoor shows on BBC television. Each video is accompanied by an audio track corresponding to a sentence of up to 100 characters.
  • Generative Pre-Training for Speech

    Generative models have attracted growing attention in recent years for their remarkable success in tasks that require estimating and sampling a data distribution to generate...
  • WHAM!

    The WHAM! dataset is used for testing the proposed Bayesian factorised speaker-environment adaptive training and test-time adaptation approach for Conformer models.
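
The VOiCES entry above concerns testing a speaker recognition system; such evaluations typically score enrollment/test trials by comparing fixed-dimensional speaker embeddings. A minimal illustrative sketch follows; the embedding dimension, random vectors, and threshold here are hypothetical stand-ins, not properties of VOiCES or any particular system.

```python
import numpy as np

def cosine_score(emb_a, emb_b):
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(emb_a, emb_b) /
                 (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))

# Hypothetical embeddings standing in for the output of a speaker encoder.
rng = np.random.default_rng(1)
enroll = rng.standard_normal(192)  # enrollment utterance embedding
test = rng.standard_normal(192)    # test utterance embedding

score = cosine_score(enroll, test)
same_speaker = score > 0.5  # threshold is application-specific
```

In practice the threshold is tuned on a development set (e.g. to the equal-error-rate operating point) rather than fixed by hand.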
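
Several of the results above (WSJ0-2mix, LibriMix, WHAM!) are mixture corpora built by summing utterances from a base corpus at controlled signal-to-noise ratios. As a rough illustration only (this is not the official generation pipeline of any of these datasets; the function name, padding mode, and synthetic signals are assumptions), two-speaker mixing can be sketched as:

```python
import numpy as np

def mix_at_snr(s1, s2, snr_db):
    """Mix two source signals so s1 is snr_db louder than s2.

    Illustrative sketch of two-speaker mixture creation; real generation
    scripts additionally handle resampling, loudness normalization, and
    optional noise augmentation.
    """
    # Pad the shorter source so both have equal length ("max" mode).
    n = max(len(s1), len(s2))
    s1 = np.pad(s1, (0, n - len(s1)))
    s2 = np.pad(s2, (0, n - len(s2)))

    # Scale s2 so the power ratio of s1 to s2 equals the target SNR.
    p1 = np.mean(s1 ** 2)
    p2 = np.mean(s2 ** 2)
    gain = np.sqrt(p1 / (p2 * 10 ** (snr_db / 10)))
    return s1 + gain * s2

# Synthetic "utterances" standing in for real audio at 16 kHz.
rng = np.random.default_rng(0)
mix = mix_at_snr(rng.standard_normal(16000), rng.standard_normal(8000), 5.0)
```

The separation model is then trained to recover the individual sources from `mix`, with the original utterances serving as targets.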