6 datasets found

Formats: JSON
Tags: protein sequence

  • UniprotKB/SwissProt

    The UniprotKB/SwissProt database contains protein sequence information.
  • ProtST

    The ProtST dataset is a collection of protein sequences and their corresponding biomedical text descriptions.
  • UniProt dataset

    The UniProt dataset is a comprehensive protein dataset. We download reviewed protein sequences (550k) with the limitation of 100 in length as D_r (57k examples). Then we use a...
  • DeepSF dataset

    The DeepSF dataset is a benchmark for protein sequence analysis.
  • Pfam protein families database

    The Pfam protein families database (as of 2019) is used for protein sequence analysis and contains 31 million protein domains.
  • AMMA dataset

    The AMMA dataset, used in the paper for protein representation learning, consists of 120k sequence, structure, and function triplets.
You can also access this registry using the API (see API Docs).
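
As a rough illustration of how results like these could be narrowed by format and tag on the client side, the sketch below filters a small hand-written JSON payload. The field names (`name`, `formats`, `tags`), the placeholder second record, and the `filter_datasets` helper are all assumptions made for illustration; they are not taken from the registry's API, so consult the API Docs for the actual endpoints and schema.

```python
import json

# Hypothetical payload: the field names and the second record are assumed,
# not taken from the registry's real API response.
SAMPLE = """
[
  {"name": "ProtST", "formats": ["JSON"], "tags": ["protein sequence"]},
  {"name": "Placeholder dataset", "formats": ["CSV"], "tags": ["protein structure"]}
]
"""


def filter_datasets(records, fmt, tag):
    """Return the names of records that list both the given format and tag."""
    return [
        r["name"]
        for r in records
        if fmt in r.get("formats", []) and tag in r.get("tags", [])
    ]


records = json.loads(SAMPLE)
matches = filter_datasets(records, "JSON", "protein sequence")
print(matches)
```

Using `.get(..., [])` keeps the filter from failing on records that omit a field, which seems a reasonable precaution when the real schema is unknown.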