- Stanford Multi-turn, Multi-domain Dialogue Dataset
  The Stanford Multi-turn, Multi-domain Dialogue Dataset is a dataset for language understanding in task-oriented dialogue systems. It contains a large number of training...
- Airline Travel Information System dataset (ATIS)
  The Airline Travel Information System dataset (ATIS) is a dataset for language understanding in task-oriented dialogue systems. It contains 4978 training utterances from Class A...
- Ref-DAVIS17
  Ref-DAVIS17 extends the DAVIS17 dataset with language descriptions for each specific object present in the videos.
- RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Ob...
  Referring video object segmentation (RVOS) aims to accurately segment the target object in a video, guided by given language expressions.
- MSR-VTT: A Large Video Description Dataset for Bridging Video and Language
  A Large Video Description Dataset for Bridging Video and Language.
- Ref-Youtube-VOS
  Ref-Youtube-VOS is an extensive referring video object segmentation dataset that comprises approximately 15,000 referring expressions associated with more than 3,900 videos.
- BERT: Pre-training of deep bidirectional transformers for language understanding
  This paper proposes BERT, a pre-trained deep bidirectional transformer for language understanding.