- Twins dataset
  The Twins dataset is a real-world dataset used for benchmarking treatment effect estimation methods.
- IOHprofiler: A Benchmarking and Profiling Tool for Iterative Optimization Heuri...
  IOHprofiler is a tool for analyzing and comparing iterative optimization heuristics. It provides statistical evaluations of the algorithms' performance by means of the...
- Diverse and Generative ML Benchmark (DIGEN)
  A collection of 40 synthetic datasets designed to test the performance of common machine learning algorithms.
- MACHIAVELLI Benchmark
  A dataset of traces from the MACHIAVELLI environment, including API calls and their outcomes.
- BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM ...
  A structured collection of tests for input-output safeguards, including established failure tests, emerging failure tests, and next-gen architecture tests.
- AstroMLab 1: Who Wins Astronomy Jeopardy!?
  A comprehensive evaluation of proprietary and open-weights large language models using the first astronomy-specific benchmarking dataset.
- Benchmark problems for gray-box optimization
  The dataset used in this paper is a set of benchmark problems for gray-box optimization, including the Sphere function, Rosenbrock function, REBGrid function, and others.
- DiLiGenT Benchmark
  A benchmark dataset for non-Lambertian and uncalibrated photometric stereo.
- WebQuestions
  The task of Question Answering over Linked Data (QALD) has received increased attention over the last years (see the surveys [14] and [36]). The task consists in mapping natural...
- CEC'2013 Special Session and Competition on Large-Scale Global Optimization
  A benchmark for large-scale global optimization, featuring composite functions with varying sizes and complexities.
- OptimSuite
  A broad benchmark suite for black-box optimization, covering a wide range of problems, including academic benchmarks, real-world applications, and discrete optimization problems.
- YCB Object and Model Set
  The YCB object and model set is a benchmark for manipulation research, consisting of 15 object categories and 3D models.
- Benchmarking single-image dehazing and beyond
- MoleculeNet dataset
  The MoleculeNet dataset is a benchmarking platform for molecular machine learning.