InstructBLIP

This dataset record describes InstructBLIP, a vision-language model for comprehensive scene understanding and generating textual descriptions.

Cite this as

Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong (2024). Dataset: InstructBLIP. https://doi.org/10.57702/euv6djqk

DOI retrieved: December 16, 2024

Additional Info

Field: Value
Created: December 16, 2024
Last update: December 16, 2024
Defined In: https://doi.org/10.48550/arXiv.2308.02862
Citation:
  • https://doi.org/10.48550/arXiv.2311.13939
  • https://doi.org/10.48550/arXiv.2403.14003
Author: Wenliang Dai
More Authors: Junnan Li, Dongxu Li, Anthony Meng Huat Tiong
Homepage: https://arxiv.org/abs/2304.10592