Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
Text-to-motion synthesis aims to generate 3D human motion that not only precisely reflects the textual description but also reveals as much motion detail as possible.
BibTeX: