Contrastive Multiple Instance Learning for Weakly Supervised Person ReID
The acquisition of large-scale, precisely labeled datasets for person re-identification (ReID) poses a significant challenge. Weakly supervised ReID has begun to address this... -
Towards Unsupervised Domain Generalization
Unsupervised domain generalization (UDG) aims to learn generalizable models from unlabeled data and to analyze the effects of pre-training on domain generalization. -
PaLI: A Jointly-Scaled Multilingual Language-Image Model
This paper proposes PaLI, a multilingual language-image model that jointly scales visual and vision-language representation learning. -
FLIP: A Method for Reducing Computation in Contrastive Language-Image Pre-training
This paper proposes FLIP, which masks half or more of the patches in each training image, cutting computation roughly in half and allowing larger batch sizes. -
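The compute saving in FLIP comes from the encoder only processing the patches that survive masking. A minimal NumPy sketch of that masking step, assuming a hypothetical `mask_patches` helper and illustrative patch shapes (not the paper's implementation):

```python
import numpy as np

def mask_patches(patches, keep_ratio=0.5, rng=None):
    """Randomly keep a subset of image patches (FLIP-style masking).

    patches: array of shape (num_patches, patch_dim).
    Returns the kept patches and their indices.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n = patches.shape[0]
    n_keep = int(n * keep_ratio)
    idx = rng.permutation(n)[:n_keep]  # random subset of patch indices
    return patches[idx], idx

# A 224x224 image split into 14x14 = 196 patches, each embedded to dim 768.
patches = np.random.randn(196, 768)
kept, idx = mask_patches(patches, keep_ratio=0.5)
print(kept.shape)  # (98, 768): the encoder sees half the tokens, ~2x less compute
```

With half the tokens, a transformer encoder's per-image cost drops roughly 2x, which is what frees memory for the larger contrastive batches the summary mentions.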
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
This paper investigates the performance of Contrastive Language-Image Pre-training (CLIP) when scaled down to limited computation budgets. -
Mixed Supervised Graph Contrastive Learning for Recommendation
Recommender systems (RecSys) play a vital role in online platforms, offering users personalized suggestions amid vast amounts of information. Graph contrastive learning aims to learn... -
Prototypical Contrastive Learning
Prototypical Contrastive Learning (PCL) introduces cluster prototypes into contrastive learning, bridging instance-wise contrastive learning and clustering for unsupervised representation learning. -
UMIC: An unreferenced metric for image captioning via contrastive learning
UMIC: An unreferenced metric for image captioning via contrastive learning -
Few-shot single-view 3D reconstruction with memory prior contrastive network
Few-shot single-view 3D reconstruction with memory prior contrastive network -
Motion-Focused Contrastive Learning of Video Representations
Motion-focused Contrastive Learning (MCL) method for self-supervised video representation learning. -
YFCC15M-V2
The dataset is used for Contrastive Language-Image Pretraining (CLIP) and its variants. -
YFCC15M-V1
The dataset is used for Contrastive Language-Image Pretraining (CLIP) and its variants. -
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations. -
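SimCLR's core objective is the NT-Xent (normalized temperature-scaled cross-entropy) loss over two augmented views of each image. A minimal NumPy sketch, with batch size and temperature chosen purely for illustration:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss over two augmented views (SimCLR-style).

    z1, z2: embeddings of shape (N, d); row i of z2 is the positive for row i of z1.
    """
    z = np.concatenate([z1, z2], axis=0)               # 2N embeddings
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = z @ z.T / tau                                # cosine similarities / temperature
    n = z.shape[0]
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    # the positive for index i is its counterpart in the other view
    pos = np.concatenate([np.arange(n // 2, n), np.arange(0, n // 2)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(n), pos] - logsumexp)       # cross-entropy per anchor
    return loss.mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 16))
z2 = z1 + 0.05 * rng.normal(size=(4, 16))              # nearly identical "augmentations"
print(nt_xent(z1, z2))
```

Aligned view pairs score a lower loss than random pairs, which is exactly the pressure that pulls augmentations of the same image together in embedding space.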
CLAMNET: Using Contrastive Learning with Variable Depth Unets for Medical Image Segmentation
This paper uses medical images from various sources, including magnetic resonance imaging (MRI) and computed tomography (CT), without the need for pixel-wise... -
i-mix: A domain-agnostic strategy for contrastive representation learning
A domain-agnostic strategy for improving contrastive representation learning. -
Contrastive Learning of Person-independent Representations for Facial Action Unit Detection
Contrastive learning of person-independent representations for facial action unit detection -
Demystifying CLIP Data
Contrastive Language-Image Pre-training (CLIP) is an approach that has advanced research and applications in computer vision, fueling modern recognition systems and generative... -
Open CLIP H14
The OpenCLIP ViT-H/14 model serves as a baseline for material recognition using contrastive learning with physics-based rendering.