1 dataset found

Tags: MT-bench

  • MT-bench

    The dataset used in the paper is MT-bench, a benchmark of 80 challenging multi-turn questions whose responses are scored automatically by an LLM judge.
You can also access this registry using the API (see API Docs).