Diverse instance discovery: Vision-Transformer for instance-aware multi-label image recognition

Multi-label image recognition is a practical and challenging computer vision task. The authors propose a Transformer-based method that uses long-range dependency modeling to overcome a limitation of CNNs, whose receptive fields are local.
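
The contrast between local receptive fields and long-range dependency modeling can be illustrated with a toy experiment. The sketch below is not the authors' model: it assumes a 1D sequence of scalar tokens, a width-3 convolution with zero padding, and single-head self-attention with identity projections. The helper names (`conv1d_same`, `self_attention`, `influenced_positions`) are hypothetical. It perturbs one input token and lists which output positions change after a single layer.

```python
import math


def conv1d_same(x, kernel):
    """1D convolution with zero padding; output has the same length as x."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(x))
    ]


def self_attention(x):
    """Single-head self-attention over scalar tokens (identity projections)."""
    out = []
    for q in x:
        scores = [q * k for k in x]          # query-key dot products
        m = max(scores)                       # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append(sum(e / total * v for e, v in zip(exps, x)))
    return out


def influenced_positions(fn, n, source, eps=1e-3):
    """Return output indices that change when input[source] is nudged by eps."""
    base = [0.1 + 0.9 * i / (n - 1) for i in range(n)]
    bumped = list(base)
    bumped[source] += eps
    before, after = fn(base), fn(bumped)
    return [i for i in range(n) if abs(after[i] - before[i]) > 1e-12]


conv_reach = influenced_positions(
    lambda x: conv1d_same(x, [1.0, 1.0, 1.0]), n=8, source=0
)
attn_reach = influenced_positions(self_attention, n=8, source=0)

print("conv layer reaches:", conv_reach)
print("attention layer reaches:", attn_reach)
```

In this toy setting, one convolution layer propagates the perturbation only to neighboring positions, while one attention layer propagates it to every position. Stacking convolutions widens the receptive field gradually, whereas attention is global from the first layer.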

BibTeX: