2 datasets found

Tags: mathematical reasoning

  • MathVista

    MathVista is a benchmark for evaluating mathematical reasoning in visual contexts.
  • MATHCHECK

    MATHCHECK is a well-designed checklist for testing task generalization and reasoning robustness, along with an automatic tool for swiftly generating checklists for most math...
You can also access this registry using the API (see API Docs).