An image is worth 16x16 words: Transformers for image recognition at scale

Cite this as

Xu et al., Chen et al., Li et al., Zhou et al., Dosovitskiy et al., Unterthiner et al., Minderer et al., Heigold et al., Gelly et al. (2024). Dataset: An image is worth 16x16 words: Transformers for image recognition at scale. https://doi.org/10.57702/ovifi3ii

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Defined In     https://doi.org/10.48550/arXiv.2304.01910
Citation       https://doi.org/10.48550/arXiv.2306.15111
Author         Xu et al.
More Authors   Chen et al., Li et al., Zhou et al., Dosovitskiy et al., Unterthiner et al., Minderer et al., Heigold et al., Gelly et al.
Homepage       https://openreview.net/forum?id=YicbFdNTTy