5 datasets found

Tags: biomedical text

  • CMID, KUAKE-QIC, Intent-Merged

    Biomedical intent detection and named entity recognition datasets
  • JNLPBA, DDI, BC5CDR, NCBI-Disease, AnatEM

    Biomedical intent detection and named entity recognition datasets
  • BC5CDR

    The BC5CDR dataset consists of 1,500 PubMed articles, split into a training set (500), a development set (500), and a test set (500). The dataset contains 15,935...
  • ProtST

    The ProtST dataset is a collection of protein sequences and their corresponding biomedical text descriptions.
  • OSCAR

    The OSCAR corpus is a multilingual web corpus that is used for pre-training large generative language models. It is a document-oriented corpus that is comparable in size and...
You can also access this registry using the API (see API Docs).