AstroMLab 1: Who Wins Astronomy Jeopardy!?

A comprehensive evaluation of proprietary and open-weights large language models using the first astronomy-specific benchmarking dataset.

BibTeX: