Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training
Generating images from graph-structured inputs, such as scene graphs, is uniquely challenging due to the difficulty of aligning nodes and connections in graphs with objects and their relations in images.