UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models
Video diffusion models have been developed for video generation, usually incorporating text and image conditioning to improve control over the generated content.
BibTeX: