Massive Multitask Language Understanding (MMLU) dataset

The MMLU dataset is a benchmark for measuring the multitask accuracy of large language models, testing their knowledge and problem-solving ability across a wide range of tasks. It consists of 15,908 multiple-choice questions distributed across 57 subject areas.
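
Since the benchmark is made up of multiple-choice questions grouped by subject, a model is typically scored by the fraction of questions it answers correctly. The sketch below illustrates that idea only. The `MultipleChoiceItem` record layout, its field names, and the example questions are hypothetical and are not the dataset's official schema or content.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultipleChoiceItem:
    """Hypothetical record for one question (assumed layout, not the official schema)."""
    subject: str
    question: str
    choices: List[str]
    answer: int  # index into `choices` of the correct option


def accuracy(items, predictions):
    """Return the fraction of items whose predicted choice index matches the answer."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions) if item.answer == pred)
    return correct / len(items)


# Made-up items for illustration only; they are not taken from MMLU.
items = [
    MultipleChoiceItem("arithmetic", "What is 2 + 2?", ["3", "4", "5", "6"], 1),
    MultipleChoiceItem("geography", "Which of these is a continent?",
                       ["Asia", "Paris", "Nile", "Everest"], 0),
]

# One prediction per item: the first is right, the second is wrong.
score = accuracy(items, [1, 2])
```

Per-subject scores can be obtained the same way by filtering `items` on `subject` before calling `accuracy`.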

BibTeX: