5 datasets found
Formats: JSON

WavCaps
The WavCaps dataset contains ChatGPT-assisted weakly-labeled audio captioning data.
Dataset (JSON)

Clotho v2
Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
Dataset (JSON)

AudioCaps
Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
Dataset (JSON)

Clotho
Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
Dataset (JSON)

Clotho: An audio captioning dataset
Audio captioning is a multi-modal task, focusing on using natural language for describing the contents of general audio. Most audio captioning methods are based on deep neural...
Dataset (JSON)