FFHQ
Large-scale datasets [18, 17, 27, 6] have boosted the quality of text-conditional image generation. However, in some domains such datasets are difficult and costly to build. In addition, well-known face datasets [7, 11, 13, 29] lack corresponding text captions, which makes it difficult to develop text-conditional image generation models on them.
BibTeX: