Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering
Knowledge-based visual question answering (KVQA), which answers visual questions using external knowledge such as knowledge graphs (KGs), has been extensively studied.