Self-supervised Video-centralised Transformer for Video Face Clustering

A self-supervised video-centralised transformer for video face clustering.

BibTeX: