9 datasets found

  • INFOLOSSQA

    INFOLOSSQA is a dataset for characterizing and recovering simplification-induced information loss in the form of question-and-answer (QA) pairs.
  • WebSplit

    The WebSplit dataset is a benchmark for the Split and Rephrase task, consisting of RDF semantic tuples.
  • Split and Rephrase Benchmark (SRB) and WikiSplit

    The datasets used in the paper are the Split and Rephrase Benchmark (SRB) and WikiSplit, which are used to evaluate the readability of sentence splitting.
  • Text Simplification Datasets: Exploration

    Text Simplification datasets have limitations and need to be improved to build more robust models.
  • A New Aligned Simple German Corpus

    A new sentence-aligned monolingual corpus for Simple German – German. It contains multiple document-aligned sources which we have aligned using automatic sentence-alignment...
  • WikiSplit

    A large dataset of naturally occurring sentence rewrites from Wikipedia edit history, providing sixty times more distinct split examples and a ninety times larger vocabulary...
  • Newsela

    The dataset is used to evaluate the proposed discourse-aware text simplification approach.
  • WikiLarge

    The dataset is used to evaluate the proposed discourse-aware text simplification approach.
  • Wiki-Auto

    The Wiki-Auto dataset is a text simplification dataset.