M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition

Fine-grained visual recognition (FGVR) is a challenging task due to subtle inter-class differences and large intra-class variations.
