2 datasets found

Tags: cross-modal translation

  • Clotho v2

    Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
  • Clotho

    Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.