Cap2Aug: Caption guided Image to Image data Augmentation

Visual recognition in a low-data regime is challenging and prone to overfitting. Several data augmentation strategies have been proposed to mitigate this, but standard transformations such as rotation, cropping, and flipping provide limited semantic variation. To address this, we propose Cap2Aug, a data augmentation strategy based on an image-to-image diffusion model that uses image captions as text prompts.
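
The idea of caption-prompted image-to-image augmentation can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the pairing scheme (editing each image with the caption of a different image from the same class) is an assumption that this page does not confirm, and `caption_guided_augment`, `Img2Img`, and `fake_img2img` are hypothetical names. The diffusion model is passed in as a plain callable so the sketch stays independent of any particular library.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical signature: (source_image, text_prompt) -> edited_image.
# In practice this would wrap an image-to-image diffusion model.
Img2Img = Callable[[object, str], object]


def caption_guided_augment(
    samples: Sequence[Tuple[object, str, int]],
    img2img: Img2Img,
) -> List[Tuple[object, int]]:
    """Edit each image using the caption of another image of the same class.

    `samples` holds (image, caption, label) triples. The pairing rule is an
    assumption made for illustration only.
    """
    # Group (image, caption) pairs by class label.
    by_class: Dict[int, List[Tuple[object, str]]] = {}
    for image, caption, label in samples:
        by_class.setdefault(label, []).append((image, caption))

    augmented: List[Tuple[object, int]] = []
    for label, items in by_class.items():
        # Ordered pairs (source, caption donor) with source != donor.
        for (src_image, _), (_, donor_caption) in permutations(items, 2):
            augmented.append((img2img(src_image, donor_caption), label))
    return augmented


def fake_img2img(image: object, prompt: str) -> str:
    # Stand-in for a real diffusion model; it only records its inputs.
    return f"{image}|{prompt}"


demo = [
    ("img_a", "a brown dog on grass", 0),
    ("img_b", "a dog catching a ball", 0),
    ("img_c", "a red car on a road", 1),
]
result = caption_guided_augment(demo, fake_img2img)
```

With a real model, `fake_img2img` would be replaced by a function that runs the source image and prompt through an image-to-image diffusion pipeline. Classes with a single sample produce no augmented images under this pairing rule.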

Data and Resources

Cite this as

Aniket Roy, Anshul Shah, Ketul Shah, Anirban Roy, Rama Chellappa (2024). Dataset: Cap2Aug: Caption guided Image to Image data Augmentation. https://doi.org/10.57702/rxzlufpr

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Defined In     https://doi.org/10.48550/arXiv.2212.05404
Author         Aniket Roy
More Authors   Anshul Shah, Ketul Shah, Anirban Roy, Rama Chellappa