MOT17 half-half
The proposed architecture, shown in Fig. 3, is based on TransTrack [21], with several modifications that reduce computational complexity and model size.
CrowdHuman
CrowdHuman is a challenging benchmark for evaluating how well detectors handle crowded scenes. It contains about 15k training images and 4k images for evaluation.