Dataset - LDM

FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos

Fine-grained adaptation of the popular CLIP model across multiple datasets.
- Dataset
- JSON
Devil in the Number: Towards Robust Multi-modality Data Filter

A web-scale dataset of text-image pairs, used in the paper to train a vision-language model. The authors propose a novel filter to...
- Dataset
- JSON
CLIP

The CLIP model and its variants are becoming the de facto backbone in many applications. However, training a CLIP model from hundreds of millions of image-text pairs can be...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

3 datasets found