The paper does not explicitly describe its dataset, but it states that the authors used three mathematics tasks (GSM8K, ASDIV, and SVAMP) and nine general-purpose reasoning tasks.
BibTeX: