Penn Treebank

The Penn Treebank dataset contains roughly one million words of 1989 Wall Street Journal text annotated in Treebank II style, comprising about 42,000 sentences of varying lengths.

Cite this as

Mitchell P Marcus, Mary Ann Marcinkiewicz, Beatrice Santorini (2024). Dataset: Penn Treebank. https://doi.org/10.57702/u5gg4t6i

DOI retrieved: November 25, 2024

Additional Info

Field         Value
Created       November 25, 2024
Last update   December 2, 2024
Defined In    https://doi.org/10.48550/arXiv.1808.05908
Citation
  • https://doi.org/10.48550/arXiv.1511.06349
  • https://doi.org/10.48550/arXiv.1708.08863
  • https://doi.org/10.48550/arXiv.2106.10704
  • https://doi.org/10.48550/arXiv.2001.03294
  • https://doi.org/10.48550/arXiv.1706.03993
  • https://doi.org/10.48550/arXiv.1909.03004
  • https://doi.org/10.48550/arXiv.2107.08382
  • https://doi.org/10.1109/IJCNN.2019.8852464
  • https://doi.org/10.48550/arXiv.1711.09873
  • https://doi.org/10.48550/arXiv.1807.09830
  • https://doi.org/10.48550/arXiv.1802.02116
  • https://doi.org/10.48550/arXiv.2005.00054
  • https://doi.org/10.48550/arXiv.1711.04755
  • https://doi.org/10.48550/arXiv.1809.03702
  • https://doi.org/10.48550/arXiv.1606.01280
  • https://doi.org/10.48550/arXiv.1704.02798
  • https://doi.org/10.48550/arXiv.1909.03569
Author        Mitchell P Marcus
More Authors  Mary Ann Marcinkiewicz, Beatrice Santorini
Homepage      https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank.html