1 dataset found

Tags: generative tasks

  • VALOR-BENCH

    VALOR-BENCH is a comprehensive human-annotated dataset for measuring hallucinations in large vision-language models, with a focus on generative tasks.