- Physicochemical Properties of Protein Tertiary Structure Data Set
Physicochemical Properties of Protein Tertiary Structure Data Set. UCI Machine Learning Repository, https://doi.org/10.24432/C5QW3H.
- Protein Structure Data from PDB
A collection of protein structures from the Protein Data Bank (PDB), used in this study to infer interactions between protein subunits.
- SE(3) Diffusion Model Dataset
The dataset used for training and testing the SE(3) diffusion model.
- Protein Data Bank
A set of 13,308 complexes of ligands bound to target proteins from the Protein Data Bank, with structures determined by X-ray crystallography.
- SCOPe dataset
Structural Classification of Proteins — extended (SCOPe) dataset.
- RAbD Benchmark
The RAbD dataset is an antibody design benchmark, used to evaluate the proposed dyMEAN model.
- Structural Antibody Database (SAbDab)
The Structural Antibody Database (SAbDab) is a dataset of antibody structures, used to train and test the proposed dyMEAN model.
- OmegaFold dataset
The OmegaFold dataset is used for protein structure prediction.
- ProteinMPNN dataset
The ProteinMPNN dataset is used for inverse folding and protein structure prediction.
- CATH dataset
The CATH dataset provides a de-duplicated set of protein structural folds spanning a wide range of functions.
- AlphaFoldDB
The dataset used in the paper for secondary structure-guided novel protein sequence generation with latent graph diffusion.