-
DMC4ML: Data Movement Complexity for Machine Learning
The dataset used in this paper for analyzing the memory cost of three machine learning algorithms: transformers, spatial convolution, and FFT. -
DOLFIN: DIFFUSION LAYOUT TRANSFORMERS WITHOUT AUTOENCODER
A novel generative model, Diffusion Layout Transformers without Autoencoder (Dolfin), which significantly improves the modeling capability with reduced complexity compared to... -
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transfo...
Semantic segmentation is a fundamental task in computer vision and enables many downstream applications. It is related to image classification since it produces per-pixel... -
PC-JeDi: Particle Cloud Jets with Diffusion
A new method to efficiently generate jets in High Energy Physics called PC-JeDi. This method utilises score-based diffusion models in conjunction with transformers which are... -
TransFusion
TransFusion is a robust LiDAR-camera fusion model for 3D object detection with transformers. -
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Tran...
Mixup is a commonly adopted data augmentation technique for image classification. Recent advances in mixup methods primarily focus on mixing based on saliency. -
PointConvFormer: Revenge of the Point-based Convolution
PointConvFormer is a novel point cloud layer that combines ideas from point convolution and transformers. -
Cityscapes
The Cityscapes dataset is a large, well-known semantic segmentation dataset of city street scenes. 19 of the dataset's 30 classes are considered for training and... -
An image is worth 16x16 words: Transformers for image recognition at scale