GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks
Automatically evaluating vision-language tasks is challenging, especially when the evaluation should reflect human judgments, because automatic methods are limited in how well they account for fine-grained details.
BibTeX: