Dataset - LDM

RefXVC: Cross-Lingual Voice Conversion with Enhanced Reference Leveraging

The proposed RefXVC system uses multiple reference sources to capture the tonal variations in a speaker's speech more accurately.
- Dataset
- JSON

Y-Vector: Multiscale Waveform Encoder for Speaker Embedding

The proposed Y-vector system is a multiscale waveform encoder that extracts speaker embeddings for speaker verification.
- Dataset
- JSON

VoxCeleb dataset

The VoxCeleb dataset is a large-scale speaker identification dataset, used to evaluate the performance of speaker recognition systems.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
