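- Phi-2: A Small Language Model
  Phi-2 is a 2.7-billion-parameter language model released by Microsoft. It is a model whose performance is evaluated on benchmarks, not a dataset.
- MBPP: A Dataset for Language Model Evaluation
  MBPP (Mostly Basic Python Problems) is a collection of roughly 1,000 crowd-sourced, entry-level Python programming problems, each with a task description, a reference solution, and test cases. It is used to evaluate the code-generation performance of language models.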
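
To make the MBPP entry more concrete, here is a minimal sketch of how a model-generated solution might be checked against an MBPP-style programming question. The task text, the assert-style tests, and the helper `passes_all_tests` are all hypothetical and written for illustration; they are not taken from the dataset or from any official evaluation harness.

```python
# Hypothetical MBPP-style task: the wording and tests are invented for illustration.
task = {
    "text": "Write a function to find the minimum of two numbers.",
    "tests": [
        "assert min_of_two(1, 2) == 1",
        "assert min_of_two(-5, -4) == -5",
        "assert min_of_two(0, 0) == 0",
    ],
}


def passes_all_tests(candidate_source, tests):
    """Return True if the candidate code runs and satisfies every assert."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        for test in tests:
            exec(test, namespace)  # raises AssertionError on a wrong answer
    except Exception:
        # Syntax errors, runtime errors, and failed asserts all count as a failure.
        return False
    return True


# Two hypothetical model outputs: one correct, one that always returns b.
good = "def min_of_two(a, b):\n    return a if a < b else b\n"
bad = "def min_of_two(a, b):\n    return b\n"

print(passes_all_tests(good, task["tests"]))
print(passes_all_tests(bad, task["tests"]))
```

This sketch runs the candidate with `exec` in the current process, which is only acceptable for trusted code. A real evaluation setup would typically run model output in a sandbox with a timeout.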