LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Imag...
Image-text retrieval (ITR) is the task of retrieving relevant images or texts given a query from the other modality. The conventional dense retrieval paradigm relies on encoding...