Multilevel language and vision integration for text-to-clip retrieval

BibTeX: