Of human criteria and automatic metrics: A benchmark of the evaluation of story generation

The HANNA dataset contains 1,056 stories generated from 96 prompts drawn from the WritingPrompts dataset.

BibTeX: