TIMIT, Aurora-4, AMI, and LibriSpeech

doi:10.57702/nwuk67kc

Our experiments use four corpora: TIMIT, Aurora-4, AMI, and LibriSpeech. TIMIT contains broadband 16 kHz recordings of phonetically balanced read speech. Aurora-4 is another broadband corpus, designed for noisy speech recognition tasks and based on the Wall Street Journal corpus (WSJ0). The AMI corpus consists of 100 hours of meeting recordings, made in three meeting rooms with different acoustic properties and captured with multiple microphones. The LibriSpeech corpus contains 1,000 hours of read speech sampled at 16 kHz.

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Cite this as

Wei-Ning Hsu, James Glass (2024). Dataset: TIMIT, Aurora-4, AMI, and LibriSpeech. https://doi.org/10.57702/nwuk67kc

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.1804.03201
Author	Wei-Ning Hsu
More Authors	James Glass
Homepage	https://github.com/wnhsu/ScalableFHVAE