RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities in gener-ating high-quality images. Medical imaging data is fundamentally different to natural images, and the language used to succinctly capture relevant details in medical data uses a different, narrow but semantically rich, domain-specific vocabulary.
BibTex: