LJSpeech and VCTK datasets

The LJSpeech dataset contains 13,100 audio clips, sampled at 22.05 kHz, of a single female speaker. The VCTK dataset contains speech recordings from 108 native English speakers with various accents.
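The clip count and sampling rate quoted above can be checked against a local copy of the data. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `summarize_wavs` is hypothetical, and it assumes the clips are stored as uncompressed PCM WAV files somewhere under a single directory.

```python
import wave
from pathlib import Path


def summarize_wavs(root):
    """Return (clip_count, set_of_sample_rates) for all .wav files under root.

    Hypothetical helper: assumes every .wav file is uncompressed PCM,
    which is what Python's standard `wave` module can read.
    """
    count = 0
    rates = set()
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rates.add(wav.getframerate())  # sampling rate in Hz
        count += 1
    return count, rates
```

For example, `summarize_wavs("LJSpeech-1.1/wavs")` (the path is an assumption about where the archive was extracted) should report 13,100 clips and a single rate of 22050 Hz if the local copy matches the description. VCTK releases that ship FLAC audio would need a different reader.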

Cite this as

LJSpeech, VCTK (2024). Dataset: LJSpeech and VCTK datasets. https://doi.org/10.57702/b1bhz24j

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Author         LJSpeech
More Authors   VCTK
Homepage       https://arxiv.org/abs/2103.04922