5 datasets found

Tags: TED Talks

  • TED-LIUM 2

    Enhancing the TED-LIUM corpus with selected data for language modeling and more TED talks.
  • TEDLIUM Corpus

    The TEDLIUM corpus is a large-volume corpus used for speech recognition and text summarization.
  • TED Speech Summarization Corpus

    Speech summarization, which generates a text summary from speech, can be achieved by combining automatic speech recognition (ASR) and text summarization (TS).
  • LRS3

    The LRS3 dataset is a large-scale dataset for visual speech recognition. It consists of thousands of spoken sentences from TED videos.
  • MuST-C

    MuST-C is a multilingual speech translation dataset, which contains at least 385 hours of audio recordings from TED Talks, with their manual transcriptions and translations at...
You can also access this registry using the API (see API Docs).