Okapi

This dataset is intended for instruction tuning of large language models (LLMs) in multiple languages with reinforcement learning from human feedback (RLHF).

BibTeX: