Penn Treebank corpus

The Penn Treebank corpus contains 49,208 sentences and over 1 million words. It is used to test the algorithm proposed in the defining paper (https://doi.org/10.48550/arXiv.2212.14345) on a real-world dataset.

Cite this as

Peter Macgregor, He Sun (2025). Dataset: Penn Treebank corpus. https://doi.org/10.57702/cbes8y63

DOI retrieved: January 2, 2025

Additional Info

Field Value
Created January 2, 2025
Last update January 2, 2025
Defined In https://doi.org/10.48550/arXiv.2212.14345
Author Peter Macgregor
More Authors He Sun