5 datasets found

Formats: JSON

  • WavCaps

    The WavCaps dataset contains ChatGPT-assisted weakly-labeled audio captioning data.
  • Clotho v2

    Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
  • AudioCaps

    Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
  • Clotho

    Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
  • Clotho: An audio captioning dataset

    Audio captioning is a multi-modal task, focusing on using natural language for describing the contents of general audio. Most audio captioning methods are based on deep neural...