SMMix: Self-Motivated Image Mixing for Vision Transformers
CutMix is a vital augmentation strategy that determines the performance and generalization ability of vision transformers (ViTs). However, the inconsistency between the mixed...
Vision Transformers for Dense Prediction
A dataset for vision transformers
Query-guided Attention in Vision Transformers for Localizing Objects Using a ...
Sketch-based object localization in natural images, where given a crude hand-drawn sketch of an object, the goal is to localize all the instances of the same object on the...