Gemma: Open models based on Gemini research and technology
Gemma is a family of lightweight, open models built from the research and technology behind Google's Gemini models.
Llama 2: Open foundation and fine-tuned chat models
Llama 2 is a collection of pretrained and fine-tuned large language models, including the Llama 2-Chat models optimized for dialogue use cases.
Buffer of Thoughts
Buffer of Thoughts is a versatile thought-augmented reasoning approach for enhancing the accuracy, efficiency, and robustness of large language models (LLMs).
Reducing Retraining by Recycling Parameter-Efficient Prompts
Parameter-efficient methods are able to use a single frozen pre-trained large language model to perform many tasks by learning task-specific soft prompts that modulate model...
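The soft-prompt idea above can be sketched in a few lines: only a small block of prompt embeddings is trainable, and it is prepended to the frozen token embeddings before the model runs. This is a minimal illustration with made-up dimensions, not the paper's implementation.

```python
import numpy as np

EMBED_DIM = 8    # illustrative embedding size
PROMPT_LEN = 4   # number of learned soft-prompt vectors

rng = np.random.default_rng(0)

# Task-specific soft prompt: the only trainable parameters.
soft_prompt = rng.normal(size=(PROMPT_LEN, EMBED_DIM))

# Frozen token-embedding table of the pre-trained model (stand-in values).
vocab_embeddings = rng.normal(size=(100, EMBED_DIM))

def build_model_input(token_ids, vocab_embeddings, prompt):
    """Prepend the learned soft prompt to the frozen token embeddings."""
    token_embeds = vocab_embeddings[token_ids]
    return np.concatenate([prompt, token_embeds], axis=0)

inputs = build_model_input(np.array([5, 17, 42]), vocab_embeddings, soft_prompt)
print(inputs.shape)  # (7, 8): PROMPT_LEN + 3 input tokens
```

Because the backbone stays frozen, switching tasks only means swapping in a different `soft_prompt` matrix, which is what makes prompt recycling across models attractive.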
TruthfulQA
TruthfulQA is a benchmark of 817 questions designed to measure whether language models reproduce common human falsehoods.
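A TruthfulQA-style evaluation loop can be sketched as follows. The example question, reference answers, and the `model_answer` stub are illustrative; the real benchmark scores free-form answers with learned metrics rather than the toy exact-match used here.

```python
# Toy TruthfulQA-style scoring: an answer counts as truthful if it matches
# a reference true answer rather than a common misconception.
questions = [
    {
        "question": "What happens if you crack your knuckles a lot?",
        "true_answers": {"nothing in particular happens"},
        "false_answers": {"you get arthritis"},
    },
]

def model_answer(question: str) -> str:
    # Stand-in for a real model call.
    return "Nothing in particular happens"

def truthful_rate(questions) -> float:
    """Fraction of questions answered with a reference true answer."""
    truthful = 0
    for q in questions:
        if model_answer(q["question"]).lower() in q["true_answers"]:
            truthful += 1
    return truthful / len(questions)

print(truthful_rate(questions))  # 1.0 for this single toy question
```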
Evaluating large language models trained on code
This paper introduces Codex, a GPT language model fine-tuned on publicly available code, and evaluates its ability to generate Python programs.
Confidence Calibration in Large Language Models
This dataset was used to analyze the self-assessment behavior of large language models.
Proof-Pile-2
Proof-Pile-2 is a corpus of mathematical text and code used for continual pre-training of large language models, with a focus on balancing the text distribution and mitigating overfitting.
Hate Speech Detection using Large Language Models
Datasets used for probing LLMs for hate speech detection, including the HateXplain, Implicit Hate, and ToxicSpans datasets.
TruthX: Alleviating Hallucinations by Editing Large Language Models
TruthX alleviates hallucinations by editing the internal representations of large language models in a learned truthful latent space.
Orca: Progressive Learning from Complex Explanation Traces
The Orca approach uses explanation tuning: the model learns from detailed explanation traces generated by a stronger large language model, imitating its step-by-step reasoning rather than just its final answers.
Evol-Instruct: A Pipeline for Automatically Evolving Instruction Datasets
The Evol-Instruct pipeline uses a large language model to iteratively rewrite seed instructions into progressively more complex and diverse variants.
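One evolution step of such a pipeline can be sketched as below. The rewriting templates are paraphrased illustrations, not the exact prompts from the paper, and `llm` is a stand-in for a real model call.

```python
import random

# Illustrative rewriting prompts (paraphrased, not the paper's templates).
EVOLVE_TEMPLATES = [
    "Rewrite the following instruction to add one extra constraint:\n{inst}",
    "Rewrite the following instruction to require multi-step reasoning:\n{inst}",
    "Rewrite the following instruction to be more specific:\n{inst}",
]

def llm(prompt: str) -> str:
    # Stand-in for a real model call; tags the instruction for demonstration.
    return "[evolved] " + prompt.splitlines()[-1]

def evolve(instruction: str, rounds: int = 2, seed: int = 0) -> list:
    """Iteratively evolve a seed instruction into harder variants."""
    rng = random.Random(seed)
    generations = [instruction]
    for _ in range(rounds):
        template = rng.choice(EVOLVE_TEMPLATES)
        instruction = llm(template.format(inst=instruction))
        generations.append(instruction)
    return generations

out = evolve("Write a poem about the sea.")
print(len(out))  # 3: the seed plus two evolved variants
```

In the full pipeline, a filtering step would discard degenerate rewrites before the evolved instructions are used for fine-tuning.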