Dataset - LDM

Mcule database

The Mcule database is a collection of purchasable compounds with known stock amounts. The network has 6,369,219 nodes and 8,808,841 edges.
- Dataset
- JSON
PubChem

The PubChem database is a comprehensive repository of chemical compounds, with each compound having a unique identifier.
- Dataset
- JSON
BindingDB

The BindingDB dataset is used for compound-protein binding affinity prediction and contains 218,615 compound-protein pairs.
- Dataset
- JSON
ZINC250K

The ZINC250K dataset is a collection of molecules used for molecular design and generation. It contains about 250,000 molecules with their corresponding properties and structures.
- Dataset
- JSON
DrugBank

The dataset used in the paper consists of a sparse user-item interaction matrix, a protein-protein similarity matrix, and a drug-drug similarity matrix.
- Dataset
- JSON
ChEMBL

The ChEMBL dataset is a large collection of over 10 million bioactive molecules that can be used for various machine learning tasks, including molecular design.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
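
As a rough illustration of programmatic access, the sketch below builds a search URL and extracts dataset names from a JSON response. The endpoint `https://example.org/api/datasets`, the query parameter `q`, and the `name` field are all assumptions made for this example; the API Docs define the real ones. No network request is made here, and the sample payload only mimics two entries from this listing.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the one given in the API Docs.
BASE_URL = "https://example.org/api/datasets"


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Return a search URL for the query (the parameter name 'q' is assumed)."""
    return f"{base_url}?{urlencode({'q': query})}"


def dataset_names(payload: str) -> list[str]:
    """Extract names from a JSON array of records (assumed response shape)."""
    records = json.loads(payload)
    return [record["name"] for record in records]


# Sample payload in the assumed shape, using two entries from this listing.
sample = json.dumps([
    {"name": "Mcule database"},
    {"name": "ZINC250K"},
])

print(build_search_url("LDM"))
print(dataset_names(sample))
```

Keeping URL construction separate from response parsing lets either part be adjusted once the actual parameter names and response schema are known.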

6 datasets found