- Learning to summarize with human feedback
The paper presents a study on the impact of synthetic data on large language models (LLMs) and proposes a method to steer LLMs towards desirable non-differentiable attributes.
- Simulated Gaussian mixture models
Simulated data consisting of a mixture of three Gaussians. Only one of the Gaussians was shifted between domains (dark blue), but differential sampling rates prevent traditional...
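
A minimal sketch of how data like this could be simulated, for illustration only. The description does not give the dimensionality, component means, spread, or sampling rates, so every number below is an assumption, as is the helper name `sample_domain`. The sketch keeps two components fixed across domains, shifts the third, and draws the components at different rates in each domain.

```python
import numpy as np


def sample_domain(rng, n, means, weights, scale=0.5):
    """Draw n points from an isotropic Gaussian mixture (hypothetical helper)."""
    # Pick a component for every point according to the domain's sampling rates.
    labels = rng.choice(len(means), size=n, p=weights)
    # Add isotropic Gaussian noise around the chosen component mean.
    points = means[labels] + scale * rng.standard_normal((n, means.shape[1]))
    return points, labels


rng = np.random.default_rng(0)

# Assumed 2-D component means; only component 0 moves between domains.
source_means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
target_means = source_means.copy()
target_means[0] += np.array([2.0, 2.0])

# Assumed mixing weights: the two domains sample the components at different rates.
source_weights = np.array([0.6, 0.2, 0.2])
target_weights = np.array([0.2, 0.4, 0.4])

source_x, source_z = sample_domain(rng, 3000, source_means, source_weights)
target_x, target_z = sample_domain(rng, 3000, target_means, target_weights)
```

Because the mixing weights differ, the overall mean of each domain changes even for the components that did not move, so a comparison of whole-domain statistics cannot isolate the shifted component.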