Fine-tuned CLIP Models are Efficient Video Learners

This work explores the capability of a simple baseline, ViFi-CLIP (Video Fine-tuned CLIP), for adapting image-based CLIP to the video domain.

Data and Resources

This dataset has no data

Cite this as

Hanoona Rasheed, Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan (2024). Dataset: Fine-tuned CLIP Models are Efficient Video Learners. https://doi.org/10.57702/nuu1d0jy

Private DOI: This DOI is not yet resolvable.
It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Field	Value
Created	December 3, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.2212.03640
Author	Hanoona Rasheed
More Authors	Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan
Homepage	https://github.com/muzairkhattak/ViFi-CLIP