- MSRA-TD500
The MSRA-TD500 dataset is a benchmark for scene text detection, containing 300 training images and 200 test images, with multilingual, arbitrarily oriented, long text lines.
- Verisimilar Image Synthesis for Detection and Recognition of Texts
The proposed scene text image synthesis technique starts with two types of inputs, “Background Images” and “Source Texts”, as illustrated in columns 1 and 2 of Fig. 1.
- K-Watermark
A benchmark for watermark text spotting in documents, together with an end-to-end solution for detecting watermark text patterns and recognizing the depicted text.
- MARIO-LAION
The MARIO-LAION dataset is a subset of the LAION-400M dataset, containing 9,194,613 high-quality text images with corresponding captions.
- ICDAR-2015 Robust Reading Competition
The ICDAR-2015 Robust Reading Competition dataset contains images with text in various fonts, sizes, and orientations.
- ICDAR-2017 Robust Reading Competition
The ICDAR-2017 Robust Reading Competition dataset contains images with text in various fonts, sizes, and orientations.
- Total-Text
Total-Text is a dataset for word-level arbitrary-shaped English text detection, containing 1,255 images for training and 300 images for testing.