-
Therapeutics Data Commons
Therapeutics Data Commons (TDC) is a collection of machine learning datasets and tasks for drug discovery and development. -
SARS-CoV-2 3CL Protease Inhibitors and Antiviral Compounds
A SARS-CoV-2 dataset containing compounds that are effective in vitro and in vivo. -
VADEERS dataset
The dataset $D = \{X_S, X_I, X_B, Y_R, \vec{y}_G\}$ consists of five parts, where $X_S \in \mathbb{R}^{304 \times 300}$ denotes drugs’ SMILES vector representations, $X_I \in \mathbb{R}^{117 \times 294}$ denotes drugs’ inhibition... -
SCAM Detective: Accurate Predictor of Small, Colloidally Aggregating Molecules
This dataset has no description
-
Combating small molecule aggregation with machine learning
Biological screens are plagued by false positive hits resulting from aggregation. A bespoke machine-learning tool to confidently and intelligibly flag such entities is disclosed. -
AIMEE dataset
This dataset is used in the paper to identify inhibitors of the SARS-CoV-2 3CL protease. -
DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Sc...
Virtual screening, which identifies potential drugs from vast compound databases to bind with a particular protein pocket, is a critical step in AI-assisted drug discovery.... -
ChEMBL database
The dataset used in this paper is the ChEMBL database, which contains drugs and other molecules along with their binding information for the proteins Lyn, Lck, and Src. -
PaccMannRL: Designing anticancer drugs from transcriptomic data via reinforce...
The dataset used in the paper is a collection of gene expression profiles from cancer cells, as well as a collection of bioactive small molecules. -
Protease dataset
A protease dataset containing molecules that are active against various proteases in enzymatic assays, drawn from experimental pharmacology databases. -
ENZYM SYNTHIE
The dataset used in the paper is a real-world dataset for drug discovery.