Container: A General-Purpose Building Block for Multi-Head Context Aggregation

Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variations. Recently, Transformers – originally introduced in natural language processing – have been increasingly adopted in computer vision.

BibTex: