FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

This paper introduces FocusCLIP, an enhancement to CLIP pretraining that adds a region-of-interest (ROI) encoder. The encoder uses heatmaps to guide the model toward key image regions, with the aim of improving zero-shot transfer on human-centric tasks.

Data and Resources

This dataset has no data

Cite this as

Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal (2024). Dataset: FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks. https://doi.org/10.57702/duh1gofn

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Author	Muhammad Saif Ullah Khan
More Authors	Muhammad Ferjad Naeem, Federico Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal