Video Instance Segmentation

Video instance segmentation requires segmenting objects and tracking them consistently over time. Because the cost of self-attention grows quadratically with input size, applying it directly to high-resolution video features is challenging and often exceeds available GPU memory.
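To make the quadratic dependency concrete, the following is a minimal back-of-the-envelope sketch rather than the method's actual implementation. It assumes one token per spatio-temporal feature position and a dense attention matrix stored in 32-bit floats. The function name, the head count, and the clip dimensions are hypothetical values chosen for illustration only.

```python
def attention_matrix_bytes(frames, height, width, heads=8, bytes_per_element=4):
    """Bytes needed to store the dense attention matrix for one layer and one sample.

    Assumption: every spatio-temporal feature position is one token, and
    each head keeps its own tokens x tokens matrix.
    """
    tokens = frames * height * width  # one token per feature position
    return heads * tokens * tokens * bytes_per_element  # quadratic in token count


# Hypothetical clip: 5 frames of 96 x 160 feature maps (illustrative values).
size = attention_matrix_bytes(frames=5, height=96, width=160)
print(f"{size / 2**30:.1f} GiB")
```

Under these assumptions, doubling both the height and the width of the feature maps multiplies the token count by 4 and the attention matrix size by 16, which is why high-resolution features run out of memory so quickly.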

BibTeX: