PaLI: A Jointly-Scaled Multilingual Language-Image Model

This paper proposes PaLI, a multilingual language-image model that scales visual and vision-language representation learning jointly.

BibTeX: