-
UAV20L: A Benchmark for Visual Tracking
The UAV20L dataset is a benchmark for visual tracking. -
OTB-2013: A Benchmark for Visual Tracking
The OTB-2013 dataset is a benchmark for visual tracking. -
ShapeNet Annotated with Referring Expressions (SNARE)
A benchmark dataset for grounding natural language referring expressions to distinguish 3D objects. -
ACE and CoNLL04
The ACE and CoNLL04 datasets are widely used entity-relation extraction benchmarks. -
SciMT-Safety
The SciMT-Safety dataset is a benchmark for evaluating the safety of AI systems in science. It consists of hundreds of refined red-teaming queries that span the fields of... -
NAS-Bench-360
A benchmark for neural architecture search. -
OpenML Benchmark
A benchmark for automated machine learning. -
PennML Benchmark Suite
The PennML benchmark suite consists of over 90 regression problems and provides a performance overview of several common regression algorithms. -
CEC2013 Benchmark Functions
The dataset used in this paper consists of the CEC2013 benchmark functions. -
Guard: A safe reinforcement learning benchmark
The dataset used in the paper is a collection of robot locomotion tasks with various constraints. -
Building a conversational agent overnight with dialogue self-play
The dataset from "Building a conversational agent overnight with dialogue self-play" is a benchmark for conversational AI. -
AI-Feynman database
The AI-Feynman database is a widely used public benchmark for symbolic regression. -
Keijzer benchmark
The Keijzer benchmark is a widely used public benchmark for symbolic regression. -
Nguyen benchmark
The Nguyen benchmark is a widely used public benchmark for symbolic regression.