MT-bench

The dataset used in the paper is MT-bench, a benchmark of 80 challenging multi-turn questions whose answers are scored automatically by an LLM judge.

BibTeX: