- Split and Rephrase Benchmark (SRB) and WikiSplit
The paper uses two datasets, the Split and Rephrase Benchmark (SRB) and WikiSplit, to evaluate the readability of sentence splitting.
- Text Simplification Datasets: Exploration
Existing text simplification datasets have limitations and need to be improved before more robust models can be built on them.
- A New Aligned Simple German Corpus
A new sentence-aligned monolingual corpus for Simple German – German. It contains multiple document-aligned sources which we have aligned using automatic sentence-alignment...