SMMix: Self-Motivated Image Mixing for Vision Transformers

CutMix is an important augmentation strategy that strongly affects the performance and generalization ability of vision transformers (ViTs). However, inconsistency between the mixed images and their corresponding labels harms its effectiveness. Existing CutMix variants tackle this problem by generating more consistent mixed images or more precise mixed labels, but they introduce heavy training overhead or require extra information, which undermines ease of use.
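
To make the image/label inconsistency concrete, the following is a minimal sketch of standard CutMix, not of SMMix itself. It assumes images are plain 2D lists and labels are one-hot lists; the function name `cutmix` and the `box` parameter are hypothetical and chosen only for illustration. The mixed label depends solely on the pasted area, regardless of what the pasted region actually contains.

```python
def cutmix(image_a, image_b, label_a, label_b, box):
    """Paste the region `box` of image_b into image_a (standard CutMix).

    Assumptions for this sketch: images are 2D lists of equal size,
    labels are one-hot lists, and `box` = (top, left, height, width)
    lies fully inside the image.
    """
    rows, cols = len(image_a), len(image_a[0])
    top, left, height, width = box

    # Copy image_a row by row so the input is not modified.
    mixed = [row[:] for row in image_a]
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = image_b[r][c]

    # lam is the fraction of pixels that still come from image_a.
    lam = 1.0 - (height * width) / (rows * cols)

    # The label is mixed by area only; image content is never inspected.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


# Usage: paste a 2x2 patch of image_b into a 4x4 image_a.
image_a = [[0] * 4 for _ in range(4)]
image_b = [[1] * 4 for _ in range(4)]
mixed, mixed_label, lam = cutmix(image_a, image_b, [1.0, 0.0], [0.0, 1.0], (0, 0, 2, 2))
print(lam, mixed_label)
```

If the pasted 2x2 patch happened to cover only background in `image_b`, the label would still assign a quarter of its weight to the second class. That mismatch is the inconsistency the description refers to.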

Cite this as

Mengzhao Chen, Mingbao Lin, Zhihang Lin, Yuxin Zhang, Fei Chao, Rongrong Ji (2024). Dataset: SMMix: Self-Motivated Image Mixing for Vision Transformers. https://doi.org/10.57702/clz8l3fh

DOI retrieved: December 3, 2024

Additional Info

Created: December 3, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.2212.12977
Author: Mengzhao Chen
More Authors: Mingbao Lin, Zhihang Lin, Yuxin Zhang, Fei Chao, Rongrong Ji
Homepage: https://github.com/ChenMnZ/SMMix