- TPC-ViT: Token Propagation Controller for Efficient Vision Transformers
  Vision transformers (ViTs) have achieved promising results on a variety of Computer Vision tasks, however their quadratic complexity in the number of input tokens has limited...
- SMMix: Self-Motivated Image Mixing for Vision Transformers
  CutMix is a vital augmentation strategy that determines the performance and generalization ability of vision transformers (ViTs). However, the inconsistency between the mixed...
- Vision Transformers for Dense Prediction
  A dataset for vision transformers
- Query-guided Attention in Vision Transformers for Localizing Objects Using a ...
  Sketch-based object localization in natural images, where given a crude hand-drawn sketch of an object, the goal is to localize all the instances of the same object on the...
- DINO dataset
  The DINO dataset: A large-scale vision transformer dataset