1 dataset found

Tags: object vocabulary

  • VisSpeech

    A dataset for audio-visual speech recognition, consisting of instructional videos whose visual content is semantically related to the speech.
You can also access this registry using the API (see API Docs).