- Self-Recognition in Language Models
  A self-recognition test for language models using model-generated security questions.
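The test protocol is only named here, not specified. The sketch below shows one plausible shape such a test could take, purely as an assumption: the subject model writes a "security question", several models answer it, and the subject must pick out its own answer. `ask` is a hypothetical model-query helper, not an API from the paper.

```python
import random

def ask(model: str, prompt: str) -> str:
    """Hypothetical helper: send `prompt` to `model` and return its reply.
    Replace with a real API client; this name is not from the paper."""
    raise NotImplementedError

def self_recognition_trial(subject: str, others: list[str]) -> bool:
    """One trial: the subject model writes a security question, every model
    answers it, and the subject must identify its own answer."""
    question = ask(subject, "Write one security question whose answer would "
                            "let you recognize your own writing later.")
    answers = [(model, ask(model, question)) for model in [subject] + others]
    random.shuffle(answers)                  # hide the subject's position
    listing = "\n".join(f"{i + 1}. {text}"
                        for i, (_, text) in enumerate(answers))
    choice = ask(subject, f"Question: {question}\nAnswers:\n{listing}\n"
                          "Which numbered answer did you write? Reply with a number.")
    return answers[int(choice.strip()) - 1][0] == subject
```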
- Latent Distance Guided Alignment Training for Large Language Models
  Alignment with human preferences is a crucial property of large language models (LLMs). Presently, the primary alignment methods, RLHF and DPO, require extensive...
- A general theoretical paradigm to understand learning from human preferences
  Proposes a general framework for learning from human preferences that covers reward-free preference optimization methods for aligning language models, alongside reward-based RLHF.
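For context on the reward-free preference optimization this framework generalizes, here is a minimal sketch of the standard DPO loss for a single preference pair; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Direct Preference Optimization loss for one (chosen, rejected) pair.

    logp_w, logp_l         : summed log-probability of the chosen (w) and
                             rejected (l) responses under the trained policy
    ref_logp_w, ref_logp_l : the same quantities under the frozen reference
    beta                   : strength of the implicit KL penalty
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log sigmoid(margin), written via logaddexp for numerical stability
    return np.logaddexp(0.0, -margin)

# Example: the policy prefers the chosen response more than the reference
# does, so the loss falls below log(2) (its value at zero margin).
print(dpo_loss(logp_w=-12.0, logp_l=-15.0, ref_logp_w=-13.0, ref_logp_l=-14.0))
```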
- A general language assistant as a laboratory for alignment
  Introduces a general-purpose language assistant as a testbed for studying techniques that align language models with human preferences.
- Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture
  Evaluates the safety alignment of LLMs when prompts blend multiple languages within a single query.
- Experimental Results
  The authors evaluate their proposed conformal prediction methods for multistep feedback covariate shift (MFCS) on synthetic black-box optimization and active...
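The MFCS method itself isn't detailed in this summary. As background, here is a minimal sketch of standard split conformal prediction, the exchangeable no-shift baseline that covariate-shift variants extend; the data and names are illustrative.

```python
import numpy as np

def split_conformal_interval(cal_preds, cal_labels, test_pred, alpha=0.1):
    """Standard split conformal prediction interval (exchangeable data,
    no covariate shift).

    cal_preds, cal_labels : predictions and true labels on a held-out
                            calibration set
    test_pred             : point prediction for a new input
    alpha                 : target miscoverage rate (0.1 -> ~90% coverage)
    """
    scores = np.abs(cal_labels - cal_preds)            # nonconformity scores
    n = len(scores)
    # finite-sample-corrected quantile level, clipped to 1 for small n
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")    # numpy >= 1.22
    return test_pred - q, test_pred + q

# Toy usage on synthetic data: the interval should cover ~90% of new labels.
rng = np.random.default_rng(0)
labels = rng.normal(size=500)
preds = labels + rng.normal(scale=0.3, size=500)       # imperfect model
print(split_conformal_interval(preds, labels, test_pred=0.0))
```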
- Universal Conceptual Cognitive Annotation (UCCA)
  Universal Conceptual Cognitive Annotation (UCCA) is a graph-based semantic annotation scheme grounded in typological linguistic principles.
- TruthX: Alleviating Hallucinations by Editing Large Language Models
  Alleviates hallucinations by editing an LLM's internal representations in a learned truthful space.
- Language models are few-shot learners
  Introduces GPT-3, a 175B-parameter language model that performs a wide range of NLP tasks from a few in-context examples, without task-specific fine-tuning.
- Natural Questions
  The Natural Questions dataset consists of real user queries issued to Google Search, each paired with a Wikipedia page and annotated with long and short answers when the page contains them.
- MS MARCO: A Human-Generated Machine Reading Comprehension Dataset
  A large-scale machine reading comprehension dataset of real Bing search queries paired with human-generated answers, used for training and evaluating question answering and retrieval models.
- LaMini: A Large-Scale Instruction Dataset
  The LaMini approach generates a large-scale instruction dataset by leveraging the outputs of a large language model, gpt-3.5-turbo.
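As a rough illustration of this style of synthetic instruction-dataset generation (a sketch, not the actual LaMini pipeline), assuming a hypothetical `call_llm` helper in place of a real chat-completion client:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call to a model such as
    gpt-3.5-turbo; replace with your provider's real client."""
    raise NotImplementedError

def generate_instructions(seed_examples, n_new=5):
    """Ask the model for new instructions in the style of a few seeds, then
    have it answer each one, yielding (instruction, response) records."""
    seed_block = "\n".join(f"- {s}" for s in seed_examples)
    prompt = (f"Here are example instructions:\n{seed_block}\n"
              f"Write {n_new} new, diverse instructions, one per line.")
    instructions = [line.strip("- ").strip()
                    for line in call_llm(prompt).splitlines() if line.strip()]
    # Answer each generated instruction to form (instruction, response) pairs.
    return [{"instruction": ins, "response": call_llm(ins)}
            for ins in instructions]

# Each record can then be serialized (e.g. json.dumps) into a JSONL training file.
```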