InstructBLIP

InstructBLIP is a vision-language model for comprehensive scene understanding and for generating textual descriptions of images.

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong (2024). Dataset: InstructBLIP. https://doi.org/10.57702/euv6djqk

DOI retrieved: December 16, 2024

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2308.02862
Citation	https://doi.org/10.48550/arXiv.2311.13939, https://doi.org/10.48550/arXiv.2403.14003
Author	Wenliang Dai
More Authors	Junnan Li, Dongxu Li, Anthony Meng Huat Tiong
Homepage	https://arxiv.org/abs/2304.10592