1 dataset found

Tags: video descriptions

  • AVSD dataset

    The AVSD dataset is a benchmark for audio-visual scene-aware dialog. It consists of 7659 training, 734 prototype validation, and 733 prototype testing dialogs, where the...
You can also access this registry using the API (see API Docs).