Chinese CLIP

Chinese CLIP is a vision-language pre-training dataset consisting of 100 million image-text pairs.

BibTeX:

@dataset{Alec_Radford_and_Jong_Wook_Kim_and_Chris_Hallacy_and_Aditya_Ramesh_and_Gabriel_Goh_and_Sandhini_Agarwal_and_Girish_Sastry_and_Amanda_Askell_and_Pamela_Mishkin_and_Jack_Clark_2024,
  abstract    = {A vision-language pre-training dataset, Chinese CLIP, which consists of 100 million image-text pairs.},
  author      = {Alec Radford and Jong Wook Kim and Chris Hallacy and Aditya Ramesh and Gabriel Goh and Sandhini Agarwal and Girish Sastry and Amanda Askell and Pamela Mishkin and Jack Clark},
  doi         = {10.57702/z829yik2},
  institution = {No Organization},
  keyword     = {'multimodal dataset', 'pre-training', 'vision-language'},
  month       = {dec},
  publisher   = {TIB},
  title       = {Chinese CLIP},
  url         = {https://service.tib.eu/ldmservice/dataset/chinese-clip},
  year        = {2024}
}