- ICDAR 2019 Competition on Scanned Receipt OCR and Information Extraction
A dataset for scanned receipt OCR and information extraction, focusing on key information detection and OCR tasks.
- CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset
A comprehensive dataset for post-OCR parsing and receipt understanding, specifically designed to enhance OCR and information extraction from receipts in multilingual contexts...
- ICDAR-2015 Robust Reading Competition
The ICDAR-2015 Robust Reading Competition dataset contains images with text in various fonts, sizes, and orientations.
- ICDAR 2013
ICDAR 2013 consists of 229 training images and 233 testing images and, like ICDAR 2015, provides "Strong", "Weak", and "Generic" lexicons for the text spotting task....
- SPAN: a Simple Predict & Align Network for Handwritten Paragraph Recognition
The proposed model performs OCR at paragraph level, without any prior segmentation stage.
- ICDAR 2013 Robust Reading Competition
The ICDAR 2013 Robust Reading Competition dataset.
- Responsa Project dataset
The Responsa Project dataset consists of more than 3M annotated letters from the Responsa Project.
- OCR dataset
The OCR dataset contains 52,152 handwritten digit samples, each an 8x16 binary image.
- Bengali word segmentation
Bengali handwritten word segmentation dataset
- Bengali OCR
Bengali handwritten character recognition dataset
- BanglaWriting
Bengali handwritten word images dataset
- ICDAR dataset
The ICDAR dataset is a dataset of handwritten digits.
- Realistic Expiry Date Dataset
The dataset consists of 3,000 samples of realistic dates covering the years 2019 to 2027, used for testing the model.
- Synthetic Expiry Date Dataset
The dataset consists of 60,000 samples of unrealistic expiry dates with the corresponding filled-in expiry dates, providing additional samples for training the model.