Hybrid-S2S: Video Object Segmentation with Recurrent Networks and Correspondence Matching
One-shot Video Object Segmentation (VOS) is the task of tracking an object of interest pixel-wise throughout a video sequence, where the object's segmentation mask for the first frame is given at inference time.
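To make the task setup concrete, here is a toy sketch of the one-shot VOS input/output contract: a list of frames plus the first-frame mask goes in, and one mask per frame comes out. This is not the Hybrid-S2S model. It swaps in a naive nearest-neighbor intensity match against the previous frame as a stand-in for any learned component, and the names `propagate_one_step` and `segment_video`, the grayscale frame format, and the `radius` parameter are all assumptions made for illustration.

```python
from typing import List

Frame = List[List[float]]  # H x W grayscale intensities (assumed toy format)
Mask = List[List[int]]     # H x W binary labels, 1 = object of interest


def propagate_one_step(prev_frame: Frame, prev_mask: Mask, frame: Frame,
                       radius: int = 1) -> Mask:
    """Give each pixel the label of its best-matching pixel in the previous
    frame, searched inside a (2*radius+1) x (2*radius+1) window."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_label = None, 0
            for py in range(max(0, y - radius), min(h, y + radius + 1)):
                for px in range(max(0, x - radius), min(w, x + radius + 1)):
                    # Matching cost: absolute intensity difference.
                    cost = abs(frame[y][x] - prev_frame[py][px])
                    # Strict '<' keeps the first candidate on ties.
                    if best_cost is None or cost < best_cost:
                        best_cost, best_label = cost, prev_mask[py][px]
            out[y][x] = best_label
    return out


def segment_video(frames: List[Frame], first_mask: Mask) -> List[Mask]:
    """One-shot VOS interface: only the first-frame mask is provided."""
    masks = [first_mask]
    for t in range(1, len(frames)):
        masks.append(propagate_one_step(frames[t - 1], masks[-1], frames[t]))
    return masks


# A bright one-pixel object moving one pixel to the right per frame.
frames = [[[9, 0, 0, 0, 0]], [[0, 9, 0, 0, 0]], [[0, 0, 9, 0, 0]]]
masks = segment_video(frames, first_mask=[[1, 0, 0, 0, 0]])
print(masks[2])  # [[0, 0, 1, 0, 0]]
```

Because each mask is propagated only from the previous prediction, errors in this toy baseline accumulate over time, which is the kind of drift that motivates combining temporal modeling with matching against a reference frame.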
BibTeX: