Mono-ViFI: A Unified Framework for Self-supervised Monocular Depth Estimation

Self-supervised monocular depth estimation has attracted notable interest because it frees training from the dependency on depth annotations. In the monocular video training setting, recent methods perform view synthesis only between existing camera views, which provides insufficient guidance. To address this, we synthesize additional virtual camera views through flow-based video frame interpolation (VFI), which we term temporal augmentation.
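As a rough illustration of temporal augmentation, the sketch below collects the source views that could be used to synthesize a target frame: the two existing neighbours plus two virtual in-between frames. Everything named here is an assumption made for illustration, not the paper's implementation: `blend_interpolate` is a hypothetical placeholder that linearly blends two frames instead of running a flow-based VFI model, the midpoint time step of 0.5 is an assumed choice, and frames are toy lists of floats rather than images.

```python
from typing import Callable, Dict, List

Frame = List[float]  # toy stand-in for an image (flattened pixel intensities)


def blend_interpolate(a: Frame, b: Frame, t: float = 0.5) -> Frame:
    """Hypothetical placeholder for a flow-based VFI model.

    A real VFI network would warp both frames along estimated optical flow;
    a linear blend is used here only so the sketch runs without dependencies.
    """
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def build_source_views(
    prev: Frame,
    cur: Frame,
    nxt: Frame,
    interpolate: Callable[[Frame, Frame, float], Frame] = blend_interpolate,
) -> Dict[str, Frame]:
    """Collect source views for synthesizing the target frame `cur`.

    Keys are time offsets relative to the target: "t-1" and "t+1" are the
    existing camera views, "t-0.5" and "t+0.5" are virtual views (assumed
    to sit at the temporal midpoints).
    """
    return {
        "t-1": prev,
        "t+1": nxt,
        "t-0.5": interpolate(prev, cur, 0.5),
        "t+0.5": interpolate(cur, nxt, 0.5),
    }


# Three consecutive toy frames with two "pixels" each.
views = build_source_views([0.0, 0.0], [2.0, 4.0], [4.0, 8.0])
```

In a training loop, each entry of `views` would serve as a source for a view-synthesis loss against the target frame, so the virtual views double the number of supervision signals per target in this sketch.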

BibTeX: