-
Multimodal CT image-report dataset
The dataset contains organ-level vision-text pairs from 17,702 patients across 104 organs.
-
CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans
Medical Vision-Language Pretraining (Med-VLP) establishes a connection between visual content from medical images and the relevant textual descriptions. Existing Med-VLP methods...