The datasets used in the paper are the Split and Rephrase Benchmark (SRB) and WikiSplit, both of which are used to evaluate the readability of sentence-splitting output.
WikiSplit is a large dataset of naturally occurring sentence rewrites drawn from Wikipedia edit history, providing sixty times more distinct split examples and a ninety times larger vocabulary...