- Long-term Leap Attention, Short-term Periodic Shift for Video Classification
Video transformers naturally incur a heavier computation burden than static vision transformers, as the former process a sequence T times longer than the latter under the...
- AccidentBlip2
A multimodal large language model for accident detection with multi-view motion reasoning