Uni3DL: Unified Model for 3D and Language Understanding

doi:10.57702/qhrzeun0

Uni3DL is a unified model for 3D and language understanding. It operates directly on point clouds and supports diverse 3D vision-language tasks, including semantic segmentation, object detection, instance segmentation, grounded segmentation, captioning, text-3D cross-modality retrieval, and zero-shot 3D object classification.

BibTeX:

@dataset{Xiang_Li_and_Jian_Ding_and_Zhaoyang_Chen_and_Mohamed_Elhoseiny_2024,
    abstract = {Uni3DL is a unified model for 3D and language understanding. It operates directly on point clouds and supports diverse 3D vision-language tasks, including semantic segmentation, object detection, instance segmentation, grounded segmentation, captioning, text-3D cross-modality retrieval, and zero-shot 3D object classification.},
    author = {Xiang Li and Jian Ding and Zhaoyang Chen and Mohamed Elhoseiny},
    doi = {10.57702/qhrzeun0},
    institution = {No Organization},
    keyword = {3D Vision-Language Understanding, Captioning, Grounded Segmentation, Instance Segmentation, Object Detection, Point Clouds, Semantic Segmentation, Text-3D Cross-Modality Retrieval},
    month = {dec},
    publisher = {TIB},
    title = {Uni3DL: Unified Model for 3D and Language Understanding},
    url = {https://service.tib.eu/ldmservice/dataset/uni3dl--unified-model-for-3d-and-language-understanding},
    year = {2024}
}