-
Wikipedia dataset
The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries...
-
Sparse Watermarking in LLMs with Enhanced Text Quality
The paper does not explicitly describe its dataset, but it mentions that the authors used the ELI5, FinanceQA, MultiNews, and QMSum datasets.