- Instruct-PTBR
The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
- Pt-Corpus-Instruct
The dataset used for training the TeenyTinyLlama pair consists of a concatenation of open-source Brazilian Portuguese datasets, including Wikipedia, CulturaX, OSCAR, Common...
- DEPS Dataset
The Describe, Explain, Plan and Select (DEPS) dataset is a collection of instruction-following tasks, focusing on describing, explaining, planning, and selecting actions.
- Goal Drift Dataset
The Goal Drift Dataset is a collection of triplets (current observation, goal imagination, instruction) from the OpenAI Contractor Gameplay Dataset, used to train the Imaginator.
- Stanford Alpaca
The dataset used in the paper is not explicitly described, but it is mentioned that the authors used CIFAR-10 and CIFAR-100 datasets for image classification, and ImageNet-100...