Hallucination Evaluation - Groups

HallusionBench

HallusionBench is an advanced diagnostic suite for entangled language hallucination and visual illusion in large vision-language models.

Dataset
JSON

VALOR-BENCH

VALOR-BENCH is a comprehensive human-annotated dataset covering hallucinations in large vision-language models, with a focus on measuring hallucinations in generative tasks.

Dataset
JSON

HaluEval-Sum

HaluEval-Sum is a large-scale hallucination evaluation benchmark for large language models.

Dataset
JSON
