Training CLIP models on Data from Scientific Papers

Contrastive Language-Image Pretraining (CLIP) models are trained on datasets extracted from web crawls, which are large in quantity but of limited quality. This paper explores whether a limited amount of higher-quality data from a specific domain improves the general performance of CLIP models.
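For context, the contrastive objective referenced above pairs each image in a batch with its caption and trains both encoders so that matched pairs score higher than all mismatched pairs. A minimal PyTorch sketch of this symmetric loss follows; the function name and temperature value are illustrative and not taken from the dataset's code:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    image_embeds, text_embeds: (batch, dim) tensors from the two encoders.
    """
    # L2-normalise so the dot product becomes a cosine similarity.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)

    # (batch, batch) similarity matrix; the diagonal holds matched pairs.
    logits = image_embeds @ text_embeds.t() / temperature

    # Each image should match its own caption, and vice versa.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```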

Data and Resources

Cite this as

Calvin Metzger (2024). Dataset: Training CLIP models on Data from Scientific Papers. https://doi.org/10.57702/vaozuc3r

DOI retrieved: December 2, 2024

Additional Info

Field        Value
Created      December 2, 2024
Last update  December 2, 2024
Defined In   https://doi.org/10.48550/arXiv.2311.04711
Author       Calvin Metzger
Homepage     https://github.com/nopperl/clip_arxiv_pmc