Dataset - LDM

WebSplit

The WebSplit dataset is a benchmark for the Split and Rephrase task, consisting of RDF semantic tuples.
- Dataset
- JSON
Split and Rephrase Benchmark (SRB) and WikiSplit

The paper uses two datasets, the Split and Rephrase Benchmark (SRB) and WikiSplit, to evaluate the readability of sentence splitting.
- Dataset
- JSON
WikiSplit

A large dataset of naturally occurring sentence rewrites from Wikipedia edit history, providing sixty times more distinct split examples and a ninety times larger vocabulary...
- Dataset
- JSON
Newsela

Newsela is used to evaluate the proposed discourse-aware text simplification approach.
- Dataset
- JSON
WikiLarge

WikiLarge is used to evaluate the proposed discourse-aware text simplification approach.
- Dataset
- JSON
Wiki-Auto

Wiki-Auto is a text simplification dataset.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

6 datasets found