-
Visual Storytelling Dataset (VIST)
The Visual Storytelling Dataset (VIST) consists of 10,117 Flickr albums and 210,819 unique images. Each sample is one sequence of 5 photos selected from the same album paired...
-
StorySalon
StorySalon is a large-scale dataset of storybooks with diverse characters, storylines, and artistic styles.