Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.

BibTeX: