LJSpeech and VCTK datasets

doi:10.57702/b1bhz24j

The LJSpeech dataset contains 13,100 audio clips, sampled at 22 kHz, from a single female speaker. The VCTK dataset contains speech from 108 native English speakers with various accents.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

LJSpeech, VCTK (2024). Dataset: LJSpeech and VCTK datasets. https://doi.org/10.57702/b1bhz24j

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	LJSpeech
More Authors	VCTK
Homepage	https://arxiv.org/abs/2103.04922