5 datasets found

Groups: Multimodal Dialogue Systems
Formats: JSON

  • FaceChat

    FaceChat is a web-based face-to-face and emotion-aware dialogue framework that integrates several advanced modules, including a web module, an engagement module, an ASR module,...
  • Video Instruction Data

    A video-centric instruction dataset, composed of 7K detailed video descriptions and 4K video conversations.
  • Ask-Anything

    A video-centric multimodal instruction dataset, composed of thousands of videos associated with detailed descriptions and conversations.
  • MMDialog

    MMDialog is a large-scale multi-turn dialogue dataset towards multi-modal open-domain conversation.
  • PhotoChat

    PhotoChat is a human-human dialogue dataset with photo-sharing behavior, intended for joint image-text modeling.