-
JARVIS-ML database
The JARVIS-ML database contains machine learning models for predicting material properties. -
Parameter, Compute and Data Trends in Machine Learning (PCD) database
The Parameter, Compute and Data Trends in Machine Learning (PCD) database. -
OpenAlex dataset
A dataset of publications affiliated with the top 25 companies, ranked by their affiliations with the 100,000 most-cited AI and ML works and with notable ML systems. -
Dataset for Building Detection using Machine Learning
The dataset used for building detection using machine learning. -
Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in ...
The dataset used in this paper is a multidimensional cascade neural network with neuron pool optimization in each cascade. -
Labeled Dataset
Labeled dataset of hotspots/wildfires and not-hotspots/not-wildfires -
Linear Regression Models
The dataset used in the paper is a collection of linear regression models with varying posterior dimensions. -
Stochastic Optimal Control Matching
The dataset used in the paper is a stochastic optimal control problem, where the goal is to drive the behavior of a noisy system. -
DMC4ML: Data Movement Complexity for Machine Learning
The dataset is used in this paper to analyze the memory cost of three machine learning algorithms: transformers, spatial convolution, and FFT. -
Lattice QCD datasets
The dataset used in this paper is a collection of lattice QCD simulations, specifically the three-point correlation function data of nucleon vector and axial-vector charges. -
Sparse Representation Learning with Modified q-VAE towards Minimal Realization...
The dataset used in this paper is a collection of high-dimensional observation data from cameras and LiDAR, used for training a world model. -
In Situ Framework for Coupling Simulation and Machine Learning with Applicati...
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations. As simulations grow, generating new training datasets for... -
Quantization of Distributed Data for Learning
The dataset used in this paper is a distributed dataset for learning, where the data is distributed over a trusted network and the communication constraints can create a... -
Performance of Machine Learning Classification in Mammography Images Using BI...
A comprehensive analysis of mammogram images to develop an enhanced understanding of different risk categories associated with breast cancer. -
A Data-Centric Optimization Framework for Machine Learning
DaCeML is a Data-Centric Machine Learning framework that provides a simple, flexible, and customizable pipeline for optimizing training of arbitrary deep neural networks. -
OptimSuite
A broad benchmark suite for black-box optimization, covering a wide range of problems, including academic benchmarks, real-world applications, and discrete optimization problems. -
Auto-MPG dataset
The dataset used in the paper is the real-world Auto-MPG dataset from the University of California, Irvine (UCI) Machine Learning Repository. -
Galaxy Zoo data releases
Interest in using machine learning for tasks such as galaxy segmentation, deblending, and classification has grown with the availability of larger galaxy datasets.