- VFX Segmentation Dataset
The VFX Segmentation Dataset is a high-resolution video segmentation dataset containing 27,046 video frames with pixel-level ground-truth segmentations.
- Actor and Action (A2D) Dataset
The Actor and Action (A2D) dataset is a popular dataset for the actor and action video segmentation task.
- Real-Time All-Purpose Segment Anything Model
Advanced by transformer architecture, vision foundation models (VFMs) achieve remarkable progress in performance and generalization ability. Segment Anything Model (SAM) is one...
- Densely Annotated Video Segmentation (DAVIS)
The DAVIS dataset contains fifty high-resolution videos with pixel-accurate ground truth.
- Video Segmentation Datasets
Datasets used for video segmentation, including Moving MNIST, Change Detection, SegTrack v2, and DAVIS.
- Temporal Deepfake Segment Benchmark
A deepfake detection method that addresses videos in which only some segments have been modified using generative techniques.
- A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
Video segmentation requires consistently segmenting and tracking objects over time. Due to the quadratic dependency on input size, directly applying self-attention to video...
- Semantic Object Classes in Video
A dataset for semantic object classes in video.
- Breakfast Actions
The Breakfast Actions dataset contains 70 hours of cooking activities of varying complexity. It contains 10 different cooking tasks (with about 170 videos per task), which can...
- Youtube-Objects
Video object segmentation is challenging due to factors such as fast motion, cluttered backgrounds, arbitrary variation in object appearance, and shape deformation.
- DAVIS-17 Dataset
The DAVIS-17 dataset is a benchmark for video object segmentation.
- OMG-Seg Dataset
The datasets used for training and testing the OMG-Seg model, including COCO panoptic, COCO-SAM, VIPSeg, Youtube-VIS-2019, and Youtube-VIS-2021.
- Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
Video K-Net is a simple, strong, and unified baseline for video segmentation.
- Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation
Video segmentation aims to segment and track every pixel in diverse scenarios accurately. This paper presents Tube-Link, a versatile framework that addresses multiple core tasks...
- Viper dataset
The Viper dataset is a visual perception benchmark to facilitate both low-level and high-level vision tasks, e.g., optical flow and semantic segmentation. It consists of videos...