WSJ

The Wall Street Journal (WSJ) corpus is a large-vocabulary continuous speech recognition dataset. It contains 36,416 sequences, representing around 80 hours of speech.

BibTeX: