LLaVA 158k

LLaVA 158k is a large-scale multimodal dataset used for training and testing multimodal large language models.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee (2024). Dataset: LLaVA 158k. https://doi.org/10.57702/99mwt6a0

DOI retrieved: December 16, 2024

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2406.10638
Author	Haotian Liu
More Authors	Chunyuan Li, Yuheng Li, Yong Jae Lee
Homepage	https://huggingface.co/datasets/LLaVA-158k