ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator

Fine-grained object discrimination using a Vision Transformer

BibTeX: