Kosmos-2: Grounding multimodal large language models to the world

BibTeX: