Cite this as

Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Remi Munos, Mark Rowland, Michal Valko, Daniele Calandriello (2024). Dataset: A general theoretical paradigm to understand learning from human preferences. Resource: Original Metadata. https://doi.org/10.57702/lafwgps7

DOI retrieved: December 16, 2024

Additional Information

Field         Value
Created       December 16, 2024
Last updated  December 16, 2024
Format        JSON