Reward Model Ensembles

doi:10.57702/axhbsmh3

The authors used three datasets: TL;DR, HELPFULNESS, and XSUM/NLI.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D’Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant (2024). Dataset: Reward Model Ensembles. https://doi.org/10.57702/axhbsmh3

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2312.09244
Author	Jacob Eisenstein
More Authors	Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D’Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant