Alpaca Eval 2

The dataset used in the paper is Alpaca Eval 2, an automated evaluation benchmark that measures how closely LLM outputs align with human preferences.

BibTeX: