Massive Multitask Language Understanding (MMLU) dataset
The MMLU dataset is a benchmark for measuring the knowledge and problem-solving ability of large language models across a wide range of tasks. It consists of 15,908 multiple-choice questions distributed across 57 subject areas.
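Because the questions are grouped by subject, results on a benchmark like this are commonly summarized as accuracy per subject. The following is a minimal sketch of that bookkeeping, not the dataset's official evaluation code. The record layout (`subject`, `question`, `choices`, `answer`) and the function name `accuracy_by_subject` are assumptions made for illustration, and the three questions are invented toy items rather than entries from the dataset.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not the official schema):
# "answer" is the index of the correct entry in "choices".
# These toy items are made up for illustration; they are not MMLU questions.
examples = [
    {"subject": "astronomy",
     "question": "Which planet is closest to the Sun?",
     "choices": ["Venus", "Mercury", "Earth", "Mars"],
     "answer": 1},
    {"subject": "astronomy",
     "question": "What is the largest planet in the Solar System?",
     "choices": ["Saturn", "Neptune", "Jupiter", "Uranus"],
     "answer": 2},
    {"subject": "elementary_mathematics",
     "question": "What is 7 * 8?",
     "choices": ["54", "56", "58", "64"],
     "answer": 1},
]


def accuracy_by_subject(examples, predictions):
    """Return {subject: fraction of questions answered correctly}.

    `predictions` holds one predicted choice index per example,
    in the same order as `examples`.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example, predicted in zip(examples, predictions):
        total[example["subject"]] += 1
        if predicted == example["answer"]:
            correct[example["subject"]] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


# Stand-in model output: one chosen index per question.
predictions = [1, 0, 1]
scores = accuracy_by_subject(examples, predictions)
print(scores)
```

Keeping the per-subject counts separate makes it easy to report either a score for each subject area or an average over all of them.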
BibTeX: