Break-A-Scene: Extracting Multiple Concepts from a Single Image

The dataset is created by augmenting a single input image with masks that mark where each target concept appears in the image. The masks can be provided by the user or generated automatically by a pre-trained segmentation model.
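
As a rough illustration of how one image and its concept masks could be turned into several training examples, here is a minimal sketch. It is an assumption about how the data might be consumed, not the authors' released code: the function names, the `<concept{i}>` placeholder tokens, the prompt template, and the random-subset sampling are all hypothetical, and masks are represented as plain nested lists of booleans to keep the example dependency-free.

```python
import random


def union_mask(masks):
    """Pixel-wise OR of equally sized boolean masks (each a list of rows)."""
    height, width = len(masks[0]), len(masks[0][0])
    return [[any(m[y][x] for m in masks) for x in range(width)]
            for y in range(height)]


def sample_example(image, masks, rng):
    """Pick a random non-empty subset of concepts and merge their masks.

    Hypothetical helper: the subset sampling and the prompt template are
    assumptions made for illustration only.
    """
    k = rng.randint(1, len(masks))                     # inclusive on both ends
    chosen = sorted(rng.sample(range(len(masks)), k))  # concept indices
    prompt = "a photo of " + " and ".join(f"<concept{i}>" for i in chosen)
    return {
        "image": image,                                # the same single image
        "concepts": chosen,
        "mask": union_mask([masks[i] for i in chosen]),
        "prompt": prompt,
    }


# Toy 4x4 "image" with two concepts in opposite corners.
image = [[0] * 4 for _ in range(4)]
mask_a = [[y < 2 and x < 2 for x in range(4)] for y in range(4)]
mask_b = [[y >= 2 and x >= 2 for x in range(4)] for y in range(4)]

rng = random.Random(0)
examples = [sample_example(image, [mask_a, mask_b], rng) for _ in range(8)]
```

Each entry in `examples` reuses the same image but carries a different combination of concepts, together with the merged mask covering exactly those concepts.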

BibTeX: