- Visual Storytelling Dataset (VIST)
The Visual Storytelling Dataset (VIST) consists of 10,117 Flickr albums and 210,819 unique images. Each sample is one sequence of 5 photos selected from the same album paired...
- Visual Story-Telling dataset (VIST)
The Visual Story-Telling dataset (VIST) is the only publicly accessible dataset for storytelling problems. It comprises 210,819 distinct images found in 10,117 different...