10 datasets found

Tags: captions

  • MSCOCO 2014 Captions Dataset

    The MSCOCO 2014 captions dataset contains 123,287 images, split into an 82,783-image training set and a 40,504-image validation set. Each image is labeled with five...
  • MARIO-LAION

    The MARIO-LAION dataset is a subset of the LAION-400M dataset, containing 9,194,613 high-quality text images with corresponding captions.
  • Graphic Narrative Corpus

    The Graphic Narrative Corpus (GNC) is a dataset of annotated comic book pages, representing English-language graphic novels with a variety of styles.
  • 3DTopia-360K

    The 3DTopia-360K dataset is a large-scale 3D object dataset, which is used to train the 3DTopia model. The dataset contains 360K 3D objects with detailed captions.
  • Flickr30k

    The Flickr30k dataset is widely used for image captioning and image-text retrieval tasks, providing a substantial collection of images with associated captions.
  • MSCOCO dataset

    The MSCOCO dataset is a large-scale image captioning dataset, containing 113,287 training images, 5,000 validation images, and 5,000 test images. The dataset is used for training and...
  • ActivityNet Captions

    ActivityNet Captions is a benchmark dataset proposed for dense video captioning. It contains 20K untrimmed videos in total, and each video has several annotated segments with...
  • MSR-VTT

    MSR-VTT is a large video description dataset for bridging video and language. The dataset contains 10k video clips with length varying from 10 to...
  • Objaverse

    The Objaverse dataset contains around 800k 3D objects. After applying a simple CLIP-based filter [27] to remove the objects whose rendered images are not relevant to its...
  • COCO

    Large-scale datasets [18, 17, 27, 6] have boosted the quality of text-conditional image generation. However, in some domains it can be difficult to build such datasets, and usually it could...
You can also access this registry using the API (see API Docs).