Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement

Disentangled representation learning aims to extract the intrinsic factors underlying observed data. Factorizing these representations without supervision is challenging and usually requires tailored loss functions or specific structural designs.

BibTeX: