ChEMBL is a large database of bioactive molecules, containing more than 2 million distinct compounds, that can be used for various machine learning tasks, including molecular design.
This is a dataset of small molecules for benchmarking molecule generation methods. It consists of molecular fingerprints, and the goal is to predict the original...
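
To make the fingerprint idea concrete, here is a minimal sketch in Python. It is an illustration only: `toy_fingerprint` and `tanimoto` are hypothetical helpers, the SMILES inputs are assumed examples, and hashing character n-grams of a SMILES string is a stand-in for a real chemical fingerprint, not the scheme used by this dataset. It shows how a fixed-length bit vector can be built and compared, and why recovering a molecule from its fingerprint is hard: the mapping is lossy.

```python
import hashlib


def toy_fingerprint(smiles, n_bits=64, ngram=2):
    """Hash overlapping character n-grams of a SMILES string into a bit vector.

    Toy stand-in for a chemical fingerprint (assumption for illustration);
    real fingerprints hash substructures of the molecular graph instead.
    """
    bits = [0] * n_bits
    # At least one iteration so that very short strings still set a bit.
    for i in range(max(len(smiles) - ngram + 1, 1)):
        fragment = smiles[i:i + ngram]
        # hashlib gives a hash that is stable across runs, unlike built-in hash().
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_bits
        bits[index] = 1
    return bits


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


ethanol = toy_fingerprint("CCO")      # n-grams: "CC", "CO"
propanol = toy_fingerprint("CCCO")    # n-grams: "CC", "CC", "CO"
pyridine = toy_fingerprint("c1ccncc1")

print(ethanol == propanol)            # both strings produce the same n-gram set
print(tanimoto(ethanol, pyridine))
```

In this toy scheme ethanol and 1-propanol collapse to the same bit vector because they share the same set of n-grams, so the fingerprint alone cannot tell them apart. Real fingerprints discard information in comparable ways, which is what makes predicting the original molecule a meaningful benchmark task.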