BIG-Bench Hard
The BIG-Bench Hard dataset is a subset of the BIG-Bench evaluation suite comprising 23 tasks on which prior language models failed to outperform the average human rater.
TruthfulQA
The TruthfulQA dataset contains 817 questions designed to measure whether language models reproduce common human falsehoods and misconceptions.
A general theoretical paradigm to understand learning from human preferences
The paper proposes a general theoretical framework for aligning language models with human preferences, formulating preference optimization without an explicit reward model (reward-free RLHF).
Llama: Open and efficient foundation language models
LLaMA is a family of open, efficient foundation language models (7B to 65B parameters) trained exclusively on publicly available data.
Mixtral of Experts
Mixtral is a sparse mixture-of-experts language model; its instruction-tuned variant is evaluated on instruction-following tasks.
FAIRBELIEF
FAIRBELIEF is a language-agnostic analytical approach to capture and assess beliefs embedded in LMs.
GMEG-wiki and GMEG-yahoo
The GMEG-wiki and GMEG-yahoo datasets are grammatical error correction evaluation sets drawn from Wikipedia and Yahoo! Answers text, respectively.
CoNLL-2014
The CoNLL-2014 shared task dataset for grammatical error correction (GEC), the task of mapping an ungrammatical sentence x_bad to a grammatical version of it, x_good.
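As an illustration, here is a hypothetical (x_bad, x_good) pair of the kind a GEC system is trained and evaluated on (the sentences are invented, not drawn from CoNLL-2014):

```python
# Hypothetical example of a GEC sentence pair: a model must map the
# ungrammatical input x_bad to its grammatical correction x_good.
gec_pair = {
    "x_bad": "She go to school every days.",
    "x_good": "She goes to school every day.",
}

# A GEC system is scored on how closely its correction of x_bad
# matches the reference x_good.
```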
LM-Critic: Language Models for Unsupervised Grammatical Error Correction
Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical/grammatical sentence pairs, but manually annotating such pairs can be expensive.
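LM-Critic's core idea can be sketched as a local-optimality check: a sentence is judged grammatical if no sentence in a small perturbation neighborhood of it scores higher under a language model. The word-swap perturbations and toy scorer below are simplified stand-ins for the paper's edit-based perturbations and LM log-probabilities:

```python
def word_swap_perturbations(sentence):
    """Simplified local perturbations: swap each pair of adjacent words."""
    words = sentence.split()
    for i in range(len(words) - 1):
        swapped = words[:]
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield " ".join(swapped)

def is_grammatical(sentence, score):
    """Local-optimality criterion: `sentence` is judged grammatical iff it
    scores at least as high as every local perturbation of itself."""
    s = score(sentence)
    return all(s >= score(p) for p in word_swap_perturbations(sentence))

# Toy scorer (stand-in for an LM log-probability): count how many of the
# sentence's bigrams appear in a tiny reference set of fluent bigrams.
FLUENT_BIGRAMS = {("the", "cat"), ("cat", "sat"), ("sat", "down")}

def toy_score(sentence):
    words = sentence.split()
    return sum((a, b) in FLUENT_BIGRAMS for a, b in zip(words, words[1:]))
```

With this toy scorer, `is_grammatical("the cat sat down", toy_score)` holds, while a locally perturbed variant such as `"cat the sat down"` fails the check because swapping the first two words back yields a higher-scoring sentence.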
GLUE benchmark
The GLUE benchmark; the authors use three of its downstream tasks, including the Stanford Sentiment Treebank...
Switchboard
The Switchboard corpus of conversational telephone speech. Human speech data comprises a rich set of domain factors such as accent, syntactic and semantic variety, and acoustic environment.
BERT: Pre-training of deep bidirectional transformers for language understanding
This paper proposes BERT, a pre-trained deep bidirectional Transformer for language understanding.
DailyDialog
The DailyDialog dataset is a large-scale multi-turn dialogue dataset consisting of 13,118 conversations with roughly eight speaker turns each on average.
Interpreting Learned Feedback Patterns in Large Language Models
The dataset is not described explicitly; the authors use a condensed representation of LLM activations obtained from sparse...