An image is worth 16x16 words: Transformers for image recognition at scale

doi:10.57702/ovifi3ii

An image is worth 16x16 words: Transformers for image recognition at scale.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Xu et al., Chen et al., Li et al., Zhou et al., Dosovitskiy et al., Unterthiner et al., Minderer et al., Heigold et al., Gelly et al. (2024). Dataset: An image is worth 16x16 words: Transformers for image recognition at scale. https://doi.org/10.57702/ovifi3ii

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2304.01910
Citation	https://doi.org/10.48550/arXiv.2306.15111
Author	Xu et al.
More Authors	Chen et al. Li et al. Zhou et al. Dosovitskiy et al. Unterthiner et al. Minderer et al. Heigold et al. Gelly et al.
Homepage	https://openreview.net/forum?id=YicbFdNTTy