ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models

doi:10.57702/qtfigx6j

ART•V is an efficient framework for auto-regressive video generation with diffusion models. It generates a single frame at a time, conditioned on the previous ones.
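
The frame-by-frame conditioning described above can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_frame_sampler`, `generate_video`, and the `context_len` parameter are hypothetical names, and the sampler is a random stand-in that perturbs the previous frame, where a real system would run a text-conditioned diffusion model.

```python
import random
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values
Sampler = Callable[[str, Sequence[Frame], random.Random], Frame]


def toy_frame_sampler(prompt: str, context: Sequence[Frame], rng: random.Random) -> Frame:
    """Hypothetical placeholder for a conditional diffusion sampler.

    With no context it draws a random first frame; otherwise it perturbs
    the most recent frame slightly. The prompt is accepted but ignored.
    """
    size = 4
    if not context:
        return [rng.random() for _ in range(size)]
    prev = context[-1]
    # Clamp to [0, 1] so values stay in a valid pixel range.
    return [min(1.0, max(0.0, p + rng.uniform(-0.05, 0.05))) for p in prev]


def generate_video(
    prompt: str,
    num_frames: int,
    sampler: Sampler,
    context_len: int = 2,  # assumed: how many previous frames are visible
    seed: int = 0,
) -> List[Frame]:
    """Generate frames one at a time, each conditioned on earlier frames."""
    rng = random.Random(seed)
    frames: List[Frame] = []
    for _ in range(num_frames):
        # Guard against context_len == 0, since frames[-0:] is the whole list.
        context = frames[-context_len:] if context_len > 0 else []
        frames.append(sampler(prompt, context, rng))
    return frames


video = generate_video("a cat walking on grass", num_frames=5, sampler=toy_frame_sampler)
```

The point of the loop is only the data flow: each new frame sees the frames produced so far, so the same single-frame sampler can be applied repeatedly to produce a video of the requested length.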

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Wenming Weng, Ruoyu Feng, Yanhui Wang, Qi Dai, Chunyu Wang, Dacheng Yin, Zhiyuan Zhao, Kai Qiu, Jianmin Bao, Yuhui Yuan, Chong Luo, Yueyi Zhang, Zhiwei Xiong (2024). Dataset: ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models. https://doi.org/10.57702/qtfigx6j

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2311.18834
Author	Wenming Weng
More Authors	Ruoyu Feng, Yanhui Wang, Qi Dai, Chunyu Wang, Dacheng Yin, Zhiyuan Zhao, Kai Qiu, Jianmin Bao, Yuhui Yuan, Chong Luo, Yueyi Zhang, Zhiwei Xiong
Homepage	https://warranweng.github.io/art.v