A general theoretical paradigm to understand learning from human preferences
The paper proposes a general theoretical framework for aligning language models with human preferences, analyzing reward-free preference optimization (learning directly from pairwise preference data, without fitting an explicit reward model as in standard RLHF).
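To make the idea of reward-free preference optimization concrete, here is a minimal sketch of a DPO-style pairwise preference loss. This is an illustrative example of the general technique, not the paper's specific objective; the function name `preference_loss`, the tensor arguments, and the `beta` value are all assumptions for the sketch.

```python
# Illustrative sketch only: a DPO-style reward-free preference loss,
# not the exact objective proposed in the paper.
import torch
import torch.nn.functional as F

def preference_loss(policy_chosen_logps: torch.Tensor,
                    policy_rejected_logps: torch.Tensor,
                    ref_chosen_logps: torch.Tensor,
                    ref_rejected_logps: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """Logistic loss on the difference of policy-vs-reference log-ratios
    for preferred (chosen) and dispreferred (rejected) completions."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log sigmoid(logits): pushes the policy to rank the chosen completion higher
    return -F.logsigmoid(logits).mean()

if __name__ == "__main__":
    # Dummy per-sequence log-probabilities for a batch of 4 preference pairs.
    torch.manual_seed(0)
    pc, pr, rc, rr = (torch.randn(4) for _ in range(4))
    print(preference_loss(pc, pr, rc, rr).item())
```

The key point the sketch illustrates is that the policy is trained directly against preference pairs via log-probability ratios to a reference model, so no separate reward model or RL loop is required.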
BibTeX: