DocRepair dataset

The dataset used for testing the DocRepair model. It contains 30 million groups of 4 consecutive sentences in English and Russian.

BibTeX: