Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

New Natural Language Processing (NLP) benchmarks are urgently needed to keep pace with the rapid development of large language models (LLMs). We present Xiezhi, the most comprehensive evaluation suite designed to assess holistic domain knowledge.

BibTeX: