-
Break-A-Scene: Extracting Multiple Concepts from a Single Image
The dataset is created by augmenting a single input image with masks that indicate the presence of target concepts. The masks can be provided by the user or generated...
-
Localized Narratives-COCO-5K
The dataset used for training and evaluation of the Würstchen model.