WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models

Cite this as

Conghui He, Wei Li, Zhenjiang Jin, Chao Xu, Bin Wang, Dahua Lin (2024). Dataset: WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models. https://doi.org/10.57702/eh9ai9h5

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2407.13773
Author         Conghui He
More Authors   Wei Li, Zhenjiang Jin, Chao Xu, Bin Wang, Dahua Lin