FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

This paper introduces FocusCLIP, an enhancement to CLIP pretraining that adds a new ROI (region-of-interest) encoder. The encoder uses heatmaps to help the model focus on key image areas, with the aim of improving zero-shot transfer on human-centric tasks.
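
This record does not say how the ROI encoder consumes its heatmaps. As a rough illustration of the general idea only, the sketch below builds a Gaussian heatmap around a few points of interest and uses it to down-weight the rest of a toy grayscale image. The function names, the keypoint coordinates, and the pixel-wise multiplication are assumptions made for this example; they are not taken from the FocusCLIP paper or its code.

```python
import math


def gaussian_heatmap(height, width, keypoints, sigma=2.0):
    """Return a height x width grid in [0, 1] that peaks at each (row, col) keypoint.

    Hypothetical helper for illustration; not part of FocusCLIP.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for kr, kc in keypoints:
                dist_sq = (r - kr) ** 2 + (c - kc) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                # Keep the strongest response when several keypoints overlap.
                heatmap[r][c] = max(heatmap[r][c], value)
    return heatmap


def apply_heatmap(image, heatmap):
    """Scale each pixel of a grayscale image by the matching heatmap weight.

    Assumed weighting scheme: a plain pixel-wise product.
    """
    return [
        [pixel * weight for pixel, weight in zip(image_row, heatmap_row)]
        for image_row, heatmap_row in zip(image, heatmap)
    ]


# Toy 8x8 image of uniform brightness with two made-up points of interest.
image = [[1.0] * 8 for _ in range(8)]
heatmap = gaussian_heatmap(8, 8, keypoints=[(2, 2), (5, 6)])
focused = apply_heatmap(image, heatmap)
```

In this toy setup, pixels at the chosen points keep their full value while pixels far from both points are scaled toward zero. A learned encoder would more plausibly take the heatmap as an input than multiply it into the image, so treat the product here purely as a way to visualise what "focus" means.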

Data and Resources

This dataset has no data

Cite this as

Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal (2024). Dataset: FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks. https://doi.org/10.57702/duh1gofn

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Created: December 16, 2024
Last update: December 16, 2024
Author: Muhammad Saif Ullah Khan
More Authors: Muhammad Ferjad Naeem, Federico Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal