SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer

Diffusion Transformer (DiT) has emerged as a new trend in generative diffusion models for image generation. In view of the extremely slow convergence of typical DiT, recent breakthroughs have been driven by mask strategies that significantly improve the training efficiency of DiT through additional intra-image contextual learning.

BibTeX: