8 datasets found

Tags: Brazilian Portuguese

  • Gpt4all-J

    The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
  • Instruct-PTBR

    The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
  • Pt-Corpus-Instruct

    The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
  • Pt-Corpus

    The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
  • PB-Br.v1

    PB-Br.v1 is a corpus of Brazilian Portuguese texts annotated with semantic roles.
  • PB-Br.v2

    PB-Br.v2 is a corpus of Brazilian Portuguese texts annotated with semantic roles.
  • PropBank.Br

    PropBank.Br is a corpus of Brazilian Portuguese texts annotated with semantic roles.
  • COCO dataset (Brazilian Portuguese)

    A Brazilian Portuguese translation of the COCO dataset, used for training the Brazilian Portuguese version of the GRIT model.
You can also access this registry using the API (see API Docs).