EscherNet: A Generative Model for Scalable View Synthesis
EscherNet is a multi-view conditioned diffusion model designed for scalable view synthesis. Building on Stable Diffusion's 2D architecture, augmented with Camera Positional Embedding (CaPE), EscherNet learns implicit 3D representations from a varying number of reference views and achieves consistent 3D novel view synthesis.
BibTeX: