Arena-Hard

Arena-Hard is a benchmark dataset for evaluating large language models (LLMs) on challenging user prompts drawn from crowdsourced Chatbot Arena data. It is an evaluation benchmark, not a model, and is not intended for training.

Cite this as

Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica (2024). Dataset: Arena-Hard. https://doi.org/10.57702/e3wlulb5

DOI retrieved: December 16, 2024

Additional Info

Field         Value
Created       December 16, 2024
Last update   December 16, 2024
Defined In    https://doi.org/10.48550/arXiv.2407.10627
Author        Tianle Li
More Authors  Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica
Homepage      https://arxiv.org/abs/2406.11939