Dataset - LDM

Individual Text Corpora Predict Openness, Interests, Knowledge and Level of E...

Individual text corpora (ICs) were generated from 214 participants, with a mean of 5 million word tokens per corpus.
- Dataset
- JSON

PEARL

Conversational recommendation dataset synthesized with persona- and knowledge-augmented LLM simulators.
- Dataset
- JSON

Wikipedia Corpus

The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles belonging to one of the following categories: People, Cities,...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

3 datasets found