3 datasets found

Groups: Factuality Evaluation
Organizations: No Organization

  • Factcheck-GPT

    Factcheck-GPT is a dataset for end-to-end, fine-grained, document-level fact-checking and correction of LLM output.
  • FEVER 2.0 dataset

    The FEVER 2.0 dataset is a collection of claims and evidence sentences for factuality evaluation.
  • FACTOR

    FACTOR is a benchmark for factuality evaluation of language models.