Multi-Modal CelebA-HQ
A large-scale face image dataset that contains real face images with corresponding semantic segmentation maps, sketches, and textual descriptions.
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Repr...
Generating motion from textual descriptions can be used in numerous applications in the game industry, film-making, and animating robots. For example, a typical way to access new...
HumanAct12
The HumanAct12 dataset is a large-scale 3D human motion dataset with textual descriptions.