MuST-C is a multilingual speech translation dataset containing at least 385 hours of audio recordings from TED Talks, together with manual sentence-level transcriptions and translations.