BABEL

The BABEL dataset is a multilingual speech recognition dataset containing over 1,000 hours of speech from 6 languages.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Haodong Duan, Jiaqi Wang, Kai Chen, Dahua Lin (2024). Dataset: BABEL. https://doi.org/10.57702/3tt2d3qo

DOI retrieved: December 2, 2024

Field	Value
Created	December 2, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.2210.05895
Citation	https://doi.org/10.48550/arXiv.2312.15004 https://doi.org/10.48550/arXiv.2012.11896 https://doi.org/10.48550/arXiv.2310.14907
Author	Haodong Duan
More Authors	Jiaqi Wang, Kai Chen, Dahua Lin
Homepage	https://babel.cs.cmu.edu/