- ChroniclingAmericaQA
  ChroniclingAmericaQA is a large-scale question-answering dataset comprising 487k question-answer pairs over a collection of historical American newspapers with the objective of...
- Seungjeongwon Corpus
  The Seungjeongwon corpus is a historical corpus that contains the diary of a royal secretary from the Joseon Dynasty, with annotated named entities and punctuation markers.
- Pre-Modern Japanese Text Dataset
  Pre-modern Japanese text dataset for character recognition
- KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning
  Pre-modern Japanese text dataset for character recognition