The Penn Treebank dataset is a large annotated corpus of English text used for language modeling. It serves as a benchmark for evaluating how well a model predicts the next character or word from the preceding context.
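To make the next-word prediction task concrete, here is a minimal sketch of a count-based bigram predictor. It is only an illustration: the toy sentence is a hypothetical stand-in and not taken from the Penn Treebank, and the function names `train_bigram` and `predict_next` are assumptions introduced for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen as context."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy stand-in text (hypothetical, not from the Penn Treebank).
toy_tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigram(toy_tokens)
prediction = predict_next(model, "the")
print(prediction)
```

A real evaluation would train on the corpus's training split and score predictions on held-out text; this sketch only shows the shape of the task, with one word of context.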