RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
Referring video object segmentation (RVOS) aims to accurately segment the target object in a video under the guidance of a given language expression.