Dataset - LDM

Wall Street Journal (WSJ) dataset

The Wall Street Journal (WSJ) dataset is a standard benchmark dataset for coherence modeling.
- Dataset
- JSON
A Cross-Domain Transferable Neural Coherence Model

Coherence is an important aspect of text quality and is crucial for ensuring its readability. The proposed coherence model is simple in structure, yet it significantly...
- Dataset
- JSON
Wikipedia dataset

The Wikipedia dataset used in the paper contains over six million English Wikipedia articles with a full-text field associated with 50 training queries...
- Dataset
- JSON
Sparse Watermarking in LLMs with Enhanced Text Quality

The paper does not explicitly describe a single dataset; the authors report using the ELI5, FinanceQA, MultiNews, and QMSum datasets.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

4 datasets found