PennTreebank

The PennTreebank dataset is a large annotated corpus of English text used for language modeling. It serves as a benchmark for evaluating how well a model predicts the next character or word from the preceding context.
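
As an illustration of the next-word prediction task, the sketch below fits a bigram model with add-one smoothing and scores it by perplexity, a common metric for language models. It is only a minimal example under stated assumptions: the toy sentences are a stand-in and are not taken from the PennTreebank, and the function names (`train_bigram`, `next_word_prob`, `perplexity`) are hypothetical helpers, not part of any dataset loader.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count how often each token appears as a context and each bigram occurs."""
    contexts = Counter(tokens[:-1])             # tokens that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # (previous, next) pairs
    return contexts, bigrams


def next_word_prob(prev, word, contexts, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)


def perplexity(tokens, contexts, bigrams, vocab_size):
    """Perplexity of the bigram model over a token sequence."""
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        log_prob += math.log(
            next_word_prob(prev, word, contexts, bigrams, vocab_size)
        )
    num_predictions = len(tokens) - 1
    return math.exp(-log_prob / num_predictions)


# Toy stand-in text (assumption): NOT taken from the PennTreebank.
train = "the cat sat on the mat the dog sat on the rug".split()
test = "the cat sat on the rug".split()

contexts, bigrams = train_bigram(train)
vocab = set(train) | set(test)
vocab_size = len(vocab)

ppl = perplexity(test, contexts, bigrams, vocab_size)
print(f"perplexity: {ppl:.2f}")
```

Lower perplexity means the model assigns higher probability to the words that actually follow each context. A model that guesses uniformly over the vocabulary would score a perplexity equal to the vocabulary size.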

BibTeX: