- NYU CTF Dataset
  A scalable open-source benchmark dataset for evaluating LLMs in offensive security.
- An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
  An empirical evaluation of LLMs for solving offensive security challenges.
- Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
  Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. This objective can be achieved from three aspects: (i) high...
- ConstraintChecker
  ConstraintChecker is a plugin component for LLMs that handles explicit relational constraints in commonsense knowledge base (CSKB) reasoning.
- Monitoring CIFs During Disasters Using LLMs
  The dataset used in this paper for monitoring Critical Infrastructure Facilities (CIFs) during disasters using Large Language Models (LLMs).
- LLMs for Social Robotics
  The dataset is not explicitly described; the authors recreated three existing human-robot interaction (HRI) studies with LLMs.
- Mixtral of Experts
  The dataset used in the paper for the instruction-following task.
- Llama 2-7B-80k
  The dataset used in the paper for the instruction-following task.
- Mistral 7B
  The dataset used in the paper for the instruction-following task.
- AlpacaEval 2.0
  The dataset used in the paper for the instruction-following task.
- Evol-Instruct-70k
  The dataset used in the paper for the in-context learning task.
- MoralChoice
  The MoralChoice survey dataset contains 1767 moral decision-making scenarios. Every scenario consists of a triple (context, action 1, action 2) and a set of auxiliary labels (see the record sketch after this list).
- Forbidden Question Dataset
  A dataset used to evaluate the effectiveness of different jailbreak attack methods against LLMs; it contains 160 highly diverse forbidden questions.
- Jailbreak Attack Dataset
  The dataset used in the paper to evaluate the effectiveness of different jailbreak attack methods against Large Language Models (LLMs).
- Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and...
  A dataset of 29 multi-party conversations between patients, their companions, and a social robot in a hospital.
- EM-Assist: Safe Automated ExtractMethod Refactoring with LLMs
  EM-Assist is an automated refactoring tool that uses LLMs to generate refactoring suggestions, then validates, enhances, and ranks them.
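
The MoralChoice entry above describes each record as a (context, action 1, action 2) triple plus auxiliary labels. The minimal Python sketch below shows one way such a record could be represented in code; the class and field names (MoralScenario, action_1, labels) and the example content are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical record shape for one MoralChoice scenario:
# a (context, action 1, action 2) triple plus auxiliary labels.
# Field names are assumptions for illustration only.
@dataclass
class MoralScenario:
    context: str                                  # situation description
    action_1: str                                 # first candidate action
    action_2: str                                 # second candidate action
    labels: Dict[str, str] = field(default_factory=dict)  # auxiliary labels

# Invented example instance, not taken from the dataset.
scenario = MoralScenario(
    context="You find a lost wallet containing cash and an ID.",
    action_1="Return the wallet to its owner.",
    action_2="Keep the cash and discard the wallet.",
    labels={"ambiguity": "low"},
)
print(scenario.context, scenario.labels)
```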