RVOS-D

RVOS-D provides more complex language descriptions, drawn from a broader range of object categories, within relatively longer video sequences.

BibTeX: