-
Simulated Medical Diagnosis Task
The dataset used in the paper is a simulated medical diagnosis task in which patients can lie about their symptoms; the goal is to predict pregnancy based on self-reported...
-
TruthfulQA
The TruthfulQA dataset contains 817 questions designed to evaluate whether language models tend to mimic common human falsehoods.
-
TruthX: Alleviating Hallucinations by Editing Large Language Models