CLIP

doi:10.57702/mxw4bsuu

The CLIP model and its variants are becoming the de facto backbone in many applications. However, training a CLIP model on hundreds of millions of image-text pairs can be prohibitively expensive.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark (2024). Dataset: CLIP. https://doi.org/10.57702/mxw4bsuu

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2306.00301
Citation	https://doi.org/10.1109/ICCV51070.2023.01403 https://doi.org/10.48550/arXiv.2312.06716 https://doi.org/10.48550/arXiv.2305.05095
Author	A. Radford
More Authors	J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark
Homepage	https://arxiv.org/abs/2106.07637