Automatic Question-Answer Generation for Long-Tail Knowledge
Automatic Question-Answer Generation for Long-Tail Knowledge -
Off-Topic Memento Dataset
A dataset for evaluating the effectiveness of different similarity measures for identifying off-topic mementos. -
Forgetting in Answer Set Programming
The dataset used in the paper is a set of answer set programs and their corresponding V-HT-models. -
ELI5, FinanceQA, MultiNews, and QMSum datasets
The ELI5, FinanceQA, MultiNews, and QMSum datasets were used in the paper. -
SearchSnippets
The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models. -
MS MARCO Passage Ranking (MARCO Dev Passage)
Dense retrieval (DR) has shown promising results in information retrieval. In essence, DR requires high-quality text representations to support effective search in the... -
Singapore Rapid Transit Systems Regulations
Singapore Rapid Transit Systems Regulations is a collection of regulations proclaimed by the Singapore government. -
Universal and transferable adversarial attacks on aligned language models
AdvBench is a dataset for evaluating the safety of large language models. -
Social Chemistry 101: Learning to reason about social and moral norms
Social Chemistry 101 is a dataset that encompasses diverse social norms. -
Aligning AI with shared human values
ETHICS is a benchmark for evaluating a language model's knowledge of fundamental ethical concepts. -
CrowS-Pairs: A challenge dataset for measuring social biases in masked langua...
CrowS-Pairs is a challenge dataset for measuring social biases in masked language models. -
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evalua...
ALI-Agent is an evaluation framework that leverages the autonomous abilities of LLM-powered agents to probe adaptive and long-tail risks in target LLMs. -
Opportunity activity recognition dataset
Opportunity activity recognition dataset contains questions answerable using Wikidata as the knowledge graph, focusing on questions with a single entity and relation. -
DISC-MedLLM
DISC-MedLLM: Bridging general large language models and real-world medical consultation. -
TruthX: Alleviating Hallucinations by Editing Large Language Models
TruthX: Alleviating Hallucinations by Editing Large Language Models -
Helpful and Harmless
A dataset used for training and evaluating the proposed RRHF paradigm. -
Distance-based approaches to repair semantics in ontology-based data access
The dataset used in this paper is a set of repairs for a knowledge base, with each repair being a maximal R-consistent subset of facts.