Audio Dataset - Groups - LDM

VoxCeleb: A Large-Scale Speaker Identification Dataset

- Dataset
- JSON

AudioCaps

Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
- Dataset
- JSON

Clotho

Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
- Dataset
- JSON
