Diffsound: Discrete diffusion model for text-to-sound generation

The paper does not explicitly describe the dataset, but it states that the authors evaluate CatFlow on two molecular generation benchmarks: QM9 and ZINC250k.

BibTeX: