The dataset used in the paper is not explicitly described, but it is mentioned that it is a large-scale captioned image dataset (LAION) used to train the Stable Diffusion model.
Large scale datasets [18, 17, 27, 6] boosted text conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...