Dataset - LDM

CAE v2: Context Autoencoder with CLIP Target

Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches. Applying the reconstruction supervision on the CLIP representation has been...
Remote Sensing Scene Classification with Masked Image Modeling (MIM)

Remote sensing scene classification has been extensively studied for its critical roles in geological survey, oil exploration, traffic management, earthquake prediction,...
MOCA: Masked Online Codebook Assignments prediction

Self-supervised representation learning for Vision Transformers (ViT), intended to mitigate the greedy need of ViT networks for very large, fully annotated datasets.

You can also access this registry using the API (see API Docs).

3 datasets found