ALIGN

Scaling up visual and vision-language representation learning with noisy text supervision.

BibTeX: