1 dataset found

Tags: video language understanding

Filter Results
  • EgoSchema

    EgoSchema is a diagnostic benchmark for assessing very long-form video-language understanding capabilities of modern multimodal systems.