Language Model - Groups

GPT-3 dataset

The dataset used in the paper is the GPT-3 dataset, which consists of output from a large language model.
- Dataset
- JSON
Collective Constitutional AI

A platform for aligning a language model with public input.
- Dataset
- JSON
Phi-2: A Dataset for Language Model Evaluation

The Phi-2 dataset is a collection of language models used to evaluate the performance of language models.
- Dataset
- JSON
MBPP: A Dataset for Language Model Evaluation

The MBPP (Mostly Basic Python Problems) dataset is a collection of basic Python programming problems used to evaluate the performance of language models.
- Dataset
- JSON
LLaMA

The dataset used in the paper is LLaMA, a large language model.
- Dataset
- JSON
Wikipedia Corpus

The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles belonging to one of the following categories: People, Cities,...
- Dataset
- JSON
BERT

The dataset used in this paper is a BERT model pre-trained on the English Wikipedia and Books datasets.
- Dataset
- JSON
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used a language model to optimize the performance of a reinforcement...
- Dataset
- JSON
Falcon 7B

This dataset has no description
- Dataset
- JSON
RedPajama 3B

This dataset has no description
- Dataset
- JSON

10 datasets found