The dataset used in the paper is a text-to-3D generation framework, named the LucidDreamer, to distill high-fidelity textures and shapes from pretrained 2D diffusion models.
The Objaverse dataset contains around 800k 3D objects. After adopting simple filter leveraging CLIP [27] to remove the objects whose rendered images are not relevant to its...