BIG-Bench Hard
The BIG-Bench Hard dataset is a subset of the BIG-Bench evaluation suite comprising 23 tasks on which prior language models failed to outperform the average human rater.
TruthfulQA
The TruthfulQA dataset contains 817 questions designed to measure whether language models reproduce common human falsehoods and misconceptions.
A general theoretical paradigm to understand learning from human preferences
The paper proposes a general theoretical framework for aligning language models with human preferences, formulating preference optimization without an explicit reward model (reward-free RLHF).
Llama: Open and efficient foundation language models
LLaMA is a family of open, efficient foundation language models (7B to 65B parameters) trained exclusively on publicly available data.
Mixtral of Experts
Mixtral is a sparse mixture-of-experts language model; its instruction-tuned variant is evaluated on instruction-following tasks.
FAIRBELIEF
FAIRBELIEF is a language-agnostic analytical approach to capture and assess beliefs embedded in LMs.
GMEG-wiki and GMEG-yahoo
The GMEG-wiki and GMEG-yahoo datasets are grammatical error correction evaluation sets drawn from Wikipedia and Yahoo! Answers text, respectively.
CoNLL-2014
The CoNLL-2014 shared task dataset for grammatical error correction (GEC), the task of mapping an ungrammatical sentence x_bad to a grammatical version of it, x_good.
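As an illustration, here is a hypothetical (x_bad, x_good) pair of the kind a GEC system is trained and evaluated on (the sentences are invented, not drawn from CoNLL-2014):

```python
# Hypothetical example of a GEC sentence pair: a model must map the
# ungrammatical input x_bad to its grammatical correction x_good.
gec_pair = {
    "x_bad": "She go to school every days.",
    "x_good": "She goes to school every day.",
}

# A GEC system is scored on how closely its correction of x_bad
# matches the reference x_good.
```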
LM-Critic: Language Models for Unsupervised Grammatical Error Correction
Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical/grammatical sentence pairs, but manually annotating such pairs can be expensive.
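LM-Critic's core idea can be sketched as a local-optimality check: a sentence is judged grammatical if no sentence in a small perturbation neighborhood of it scores higher under a language model. The word-swap perturbations and toy scorer below are simplified stand-ins for the paper's edit-based perturbations and LM log-probabilities:

```python
def word_swap_perturbations(sentence):
    """Simplified local perturbations: swap each pair of adjacent words."""
    words = sentence.split()
    for i in range(len(words) - 1):
        swapped = words[:]
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield " ".join(swapped)

def is_grammatical(sentence, score):
    """Local-optimality criterion: `sentence` is judged grammatical iff it
    scores at least as high as every local perturbation of itself."""
    s = score(sentence)
    return all(s >= score(p) for p in word_swap_perturbations(sentence))

# Toy scorer (stand-in for an LM log-probability): count how many of the
# sentence's bigrams appear in a tiny reference set of fluent bigrams.
FLUENT_BIGRAMS = {("the", "cat"), ("cat", "sat"), ("sat", "down")}

def toy_score(sentence):
    words = sentence.split()
    return sum((a, b) in FLUENT_BIGRAMS for a, b in zip(words, words[1:]))
```

With this toy scorer, `is_grammatical("the cat sat down", toy_score)` holds, while a locally perturbed variant such as `"cat the sat down"` fails the check because swapping the first two words back yields a higher-scoring sentence.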
GLUE benchmark
The GLUE benchmark; the authors use three of its downstream tasks, including the Stanford Sentiment Treebank...
Switchboard
The Switchboard corpus of conversational telephone speech. Human speech data comprises a rich set of domain factors such as accent, syntactic and semantic variety, and acoustic environment.
BERT: Pre-training of deep bidirectional transformers for language understanding
This paper proposes BERT, a pre-trained deep bidirectional Transformer for language understanding.
DailyDialog
The DailyDialog dataset is a large-scale multi-turn dialogue dataset consisting of 13,118 conversations with roughly eight speaker turns each on average.
Interpreting Learned Feedback Patterns in Large Language Models
The dataset is not described explicitly; the authors use a condensed representation of LLM activations obtained from sparse...