RAMM: Retrieval-augmented Biomedical Visual Question Answering

A retrieval-augmented pretrain-and-finetune paradigm for biomedical visual question answering (VQA). It comprises three parts: PMCPM, a dataset of high-quality image-text pairs; a pre-trained multi-modal model; and a novel retrieval-augmented attention module for fine-tuning.

Cite this as

Zheng Yuan, Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang (2024). Dataset: RAMM: Retrieval-augmented Biomedical Visual Question Answering. https://doi.org/10.57702/fxt8j2bb

DOI retrieved: December 16, 2024

Additional Info

Created: December 16, 2024
Last update: December 16, 2024
Defined In: https://doi.org/10.48550/arXiv.2303.00534
Author: Zheng Yuan
More Authors: Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang