103 datasets found

Tags: text classification

  • LAION

    The paper does not explicitly describe its dataset, but notes that it is a large-scale captioned image dataset (LAION) used to train the Stable Diffusion model.
  • GLUE

    Pre-trained language models (PrLM) have to carefully manage input units when training on a very large text with a vocabulary consisting of millions of words. Previous works have...
  • Elsevier OA CC-BY corpus

    The Elsevier OA CC-BY corpus consists of 40,000 open-access articles from across Elsevier's journals, representing a diverse range of research disciplines.
You can also access this registry using the API (see API Docs).