MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers

MaskINT performs text-based video editing with a two-stage pipeline: joint editing of keyframes, followed by structure-aware frame interpolation.

BibTeX: