Cite this as

Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Michael I. Jordan, Jiantao Jiao (2024). Dataset: Fine-tuning Language Models with Advantage-Induced Policy Alignment. Resource: Original Metadata. https://doi.org/10.57702/3oqqdleq

DOI retrieved: December 16, 2024

Additional Information

Field         Value
Created       December 16, 2024
Last updated  December 16, 2024
Format        JSON