Bioinformatics - Groups

Diabetes and Asia datasets

The Diabetes and Asia datasets were used for the experiments.
- Dataset
- JSON
HIV-1 protease cleavage dataset

The HIV-1 protease cleavage dataset is compiled from four source data files. Its primary purpose is to support the development of effective protease cleavage inhibitors by predicting whether the...
- Dataset
- JSON
Chemical-Disease Relations (CDR) dataset

The Chemical-Disease Relations (CDR) dataset was built for the BioCreative V challenge and manually annotated with a single relation type, "chemical-induced disease".
- Dataset
- JSON
IEDB

TCR-epitope binding affinity prediction dataset
- Dataset
- JSON
McPAS

TCR-epitope binding affinity prediction dataset
- Dataset
- JSON
UniProtKB Human Gene binding prediction

UniProtKB Human Gene binding prediction
- Dataset
- JSON
IEDB weekly automated benchmark datasets

IEDB weekly automated benchmark datasets
- Dataset
- JSON
LGG Multi-omic Data

The dataset is a collection of multi-omic data from lower-grade glioma (LGG) tumor samples collected by the TCGA Research Network.
- Dataset
- JSON
Machine Learning and Bioinformatics for Diagnosis Analysis of Obesity Spectru...

The dataset is used for diagnostic analysis of obesity spectrum disorders.
- Dataset
- JSON
Alternative Splicing

Alternative Splicing is a dataset of RNA sequences used for predicting alternative gene splicing.
- Dataset
- JSON
SCOPe dataset

Structural Classification of Proteins - extended (SCOPe) dataset
- Dataset
- JSON
Mutag dataset

The Mutag dataset is a benchmark dataset for graph neural networks, containing 188 cancer and 67 non-cancer cells.
- Dataset
- JSON
Enzyme Structure Dataset

This dataset contains enzyme structures and their corresponding features.
- Dataset
- JSON
Protein Structure Dataset

This dataset contains protein structures and their corresponding features.
- Dataset
- JSON
Promoter Design

We use the promoter DNA sequence dataset containing 100k promoter sequences with the corresponding transcription initiation signal profiles.
- Dataset
- JSON
Binding MOAD

The dataset is used for compound-protein binding affinity prediction, and it contains 1,963 compound-protein pairs.
- Dataset
- JSON
BindingDB

The dataset is used for compound-protein binding affinity prediction, and it contains 218,615 compound-protein pairs.
- Dataset
- JSON
Hetionet

The dataset used in the paper is a small drug-gene-disease network.
- Dataset
- JSON
DrugBank

The dataset used in the paper is a sparse user-item interaction matrix, a protein-protein similarity matrix, and a drug-drug similarity matrix.
- Dataset
- JSON
Benchmark datasets

The dataset used in the paper is a collection of small images, each representing a patch of a jigsaw puzzle. The patches are of the same size and orientation, and the goal is to...
- Dataset
- JSON

20 datasets found