The Reuters Video-Language News Dataset (ReutersViLNews) is a large-scale video-language understanding dataset containing 1,974 long-form news videos with an average video length of 91.2 seconds.