- FIT-AR: Far-reaching Interleaved Transformers for Autoregressive Modeling
We introduce FIT-AR, a variant of FIT that incorporates causal masks and shifting in cross-attention.
- Visual AutoRegressive modeling (VAR)
Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine “next-scale prediction” or...