Uni3DL: Unified Model for 3D and Language Understanding

doi:10.57702/qhrzeun0

Uni3DL is a unified model for 3D and language understanding. It operates directly on point clouds and supports diverse 3D vision-language tasks, including semantic segmentation, object detection, instance segmentation, grounded segmentation, captioning, text-3D cross-modality retrieval, and zero-shot 3D object classification.

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Cite this as

Xiang Li, Jian Ding, Zhaoyang Chen, Mohamed Elhoseiny (2024). Dataset: Uni3DL: Unified Model for 3D and Language Understanding. https://doi.org/10.57702/qhrzeun0

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2312.03026
Author	Xiang Li
More Authors	Jian Ding, Zhaoyang Chen, Mohamed Elhoseiny
Homepage	https://uni3dl.github.io/