NAF is a dataset for form understanding in historical scanned documents. It contains 165 document images with annotated text areas of various categories.
FUNSD is a dataset for form understanding in noisy scanned documents. It contains 199 document images with annotated text areas of four categories: key (labeled "question" in the released annotations), value (labeled "answer"), header, and other.
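
To make the four-category annotation scheme concrete, here is a minimal sketch of how such annotations might be inspected. It assumes a FUNSD-style JSON layout in which each image has a `form` list of entities with `id`, `text`, `label`, `box`, and `linking` fields, and in which keys and values carry the labels `question` and `answer`. The sample record is hand-written for illustration and is not taken from the dataset; the helper names `count_labels` and `key_value_pairs` are hypothetical.

```python
from collections import Counter

# Hypothetical, hand-written record imitating a FUNSD-style annotation.
# Real files would be read with json.load(); field names are assumed here.
sample_annotation = {
    "form": [
        {"id": 0, "text": "Registration Form", "label": "header",
         "box": [50, 20, 300, 45], "linking": []},
        {"id": 1, "text": "Name:", "label": "question",
         "box": [50, 80, 110, 100], "linking": [[1, 2]]},
        {"id": 2, "text": "Jane Doe", "label": "answer",
         "box": [120, 80, 220, 100], "linking": [[1, 2]]},
        {"id": 3, "text": "Page 1", "label": "other",
         "box": [280, 400, 330, 415], "linking": []},
    ]
}


def count_labels(annotation):
    """Count how many annotated text areas fall into each category."""
    return Counter(entity["label"] for entity in annotation["form"])


def key_value_pairs(annotation):
    """Return (key text, value text) pairs by following question -> answer links."""
    by_id = {entity["id"]: entity for entity in annotation["form"]}
    pairs = []
    for entity in annotation["form"]:
        if entity["label"] != "question":
            continue
        for source_id, target_id in entity["linking"]:
            target = by_id.get(target_id)
            # Keep only links that start at this key and end at a value.
            if source_id == entity["id"] and target and target["label"] == "answer":
                pairs.append((entity["text"], target["text"]))
    return pairs


if __name__ == "__main__":
    print(count_labels(sample_annotation))
    print(key_value_pairs(sample_annotation))
```

The same two helpers could be applied to each annotation file in turn to obtain per-category counts for a whole split, provided the files follow the assumed layout.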