Latent Distance Guided Alignment Training for Large Language Models
Ensuring alignment with human preferences is a crucial characteristic of large language models (LLMs). Currently, the primary alignment methods, RLHF (Reinforcement Learning from Human Feedback) and DPO (Direct Preference Optimization), are effective but require extensive human annotation, which is expensive.
BibTeX: