Fine-tuned CLIP Models are Efficient Video Learners

This work explores ViFi-CLIP (Video Fine-tuned CLIP), a simple baseline for adapting image-based CLIP to the video domain.

Data and Resources

This dataset has no data

Cite this as

Hanoona Rasheed, Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan (2024). Dataset: Fine-tuned CLIP Models are Efficient Video Learners. https://doi.org/10.57702/nuu1d0jy

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Created: December 3, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.2212.03640
Author: Hanoona Rasheed
More Authors: Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan
Homepage: https://github.com/muzairkhattak/ViFi-CLIP