1 dataset found

Tags: challenges

  • BIG-Bench Hard

    The BIG-Bench Hard dataset is derived from the original BIG-Bench evaluation suite and focuses on tasks that remain challenging for existing language models.

You can also access this registry using the API (see API Docs).