1 dataset found

Tags: Multi-modal Transformer

  • VATEX

    The dataset used in the paper is a video question answering dataset, applied to large-scale video-language pre-training.
You can also access this registry using the API (see API Docs).