IMAGINE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
Automatic evaluation metrics for natural language generation (NLG) conventionally rely on token-level or embedding-level comparisons with reference texts. This differs from human language processing, in which visual imagination often improves comprehension.
BibTeX: