- VideoAttentionTarget
VideoAttentionTarget is a video-based gaze target dataset comprising 71,666 frames from 1,331 clips.
- GazeFollow
GazeFollow is a large-scale dataset consisting of 122,143 images with 130,339 annotations on head-target instances.
- GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Gaze target detection aims to directly associate individuals and their gaze targets within a single image or across multiple video frames.