3 datasets found

Tags: multimodal dialogue systems

  • FaceChat

    FaceChat is a web-based face-to-face and emotion-aware dialogue framework that integrates several advanced modules, including a web module, an engagement module, an ASR module,...
  • Video Instruction Data

    A video-centric instruction dataset, composed of 7K detailed video descriptions and 4K video conversations.
  • Ask-Anything

    A video-centric multimodal instruction dataset, composed of thousands of videos associated with detailed descriptions and conversations.