Text Normalization - Groups

Polish texts for diachronic normalization

A dataset of Polish texts with historical and contemporary spellings, used for diachronic normalization.
- Dataset
- JSON

Normalized Ligurian Corpus

A dataset of 4,394 Ligurian sentences in different spelling systems paired with normalized versions.
- Dataset
- JSON

2 datasets found