2 datasets found

Groups: Question Answering
Formats: JSON

  • DISC-MedLLM

    DISC-MedLLM: Bridging general large language models and real-world medical consultation.
  • CMB-Exam

    A large-scale Chinese benchmark for evaluating medical large language models. The dataset contains 280,839 samples across 74 tasks and covers 24 departments and 150 diseases.