Clotho

Automated audio captioning is a cross-modal translation task in which the content of an audio clip is described in natural-language sentences.

BibTeX: