Multimodal Learning - Groups

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models

Dataset
JSON

Visual instruction tuning

Dataset
JSON

Flamingo: a visual language model for few-shot learning

Dataset
JSON

Audio-visual scene-aware dialog

Dataset
JSON

ChatBridge

ChatBridge is a multimodal language model capable of perceiving real-world multimodal information, as well as following instructions, thinking, and interacting with humans in...

Dataset
JSON

ShapeNeRF–Text

The ShapeNeRF–Text dataset consists of 40K paired NeRFs and language annotations for ShapeNet objects.

Dataset
JSON

CSL

The CSL dataset is a large-scale Chinese scientific literature dataset obtained from the "Qianyan" open-source NLP platform. It consists of 396,209 Chinese core journal papers'...

Dataset
JSON

Training CLIP models on Data from Scientific Papers

Contrastive Language-Image Pretraining (CLIP) models are trained on datasets extracted from web crawls, which are large in quantity but limited in quality. This paper explores...

Dataset
JSON

COCO

Large-scale datasets [18, 17, 27, 6] have boosted the quality of text-conditional image generation. However, in some domains it can be difficult to build such datasets, and usually it could...

Dataset
JSON

9 datasets found