4 datasets found
Tags: document layout analysis

DiT
DiT: Self-supervised pre-training for document image Transformer.

DocBank
DocBank consists of 500K document layouts built by weak supervision from articles available on arXiv.

D4LA
D4LA is a new benchmark, described as the most diverse and detailed manually labeled dataset for document layout analysis.

PubLayNet dataset
The PubLayNet dataset is described as the largest dataset ever for document layout analysis.