Dataset - LDM

VisSpeech

VisSpeech is a dataset for audio-visual speech recognition. It consists of instructional videos with semantically related visual content.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
