CLIP-GLaSS

The dataset used for the text-to-image task consists of 20 context tokens concatenated with three fixed tokens that represent the static context "the picture of".
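
The token layout described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token IDs are hypothetical placeholders (real IDs depend on the tokenizer, which this record does not specify), the function name build_sequence is invented for illustration, and placing the fixed tokens after the context tokens is an assumption.

```python
from typing import List, Sequence

CONTEXT_LENGTH = 20  # number of context tokens stated in the description

# Hypothetical placeholder IDs standing in for "the picture of";
# the real IDs depend on the tokenizer and are not given in the source.
FIXED_TOKENS: List[int] = [101, 102, 103]


def build_sequence(context_tokens: Sequence[int],
                   fixed_tokens: Sequence[int] = FIXED_TOKENS) -> List[int]:
    """Concatenate the context tokens with the fixed tokens.

    Assumption: the fixed tokens are appended after the context tokens.
    """
    if len(context_tokens) != CONTEXT_LENGTH:
        raise ValueError(
            f"expected {CONTEXT_LENGTH} context tokens, got {len(context_tokens)}"
        )
    return list(context_tokens) + list(fixed_tokens)


# Dummy context tokens 0..19, used only to show the resulting length.
example = build_sequence(list(range(CONTEXT_LENGTH)))
print(len(example))  # 23
```

Under these assumptions every sequence has a fixed length of 23 tokens, of which only the first 20 vary.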

Cite this as

Federico Galatolo, Mario Cimino, Gigliola Vaglini (2024). Dataset: CLIP-GLaSS. https://doi.org/10.57702/i6xqwwu8

DOI retrieved: December 3, 2024

Additional Info

Created: December 3, 2024
Last update: December 3, 2024
Author: Federico Galatolo
More Authors: Mario Cimino, Gigliola Vaglini
Homepage: https://github.com/galatolofederico/clip-glass