LLaMA-AdapterV2
LLaMA-AdapterV2: A parameter-efficient visual instruction model for text-image generation.
M2Chat: Empowering VLM for Multimodal LLM Interleaved
M2Chat is a unified multimodal LLM framework for generating interleaved text-image conversations across various scenarios.
Hand-drawn Symbol Recognition of Surgical Flowsheet Graphs with Deep Image Se...
Dataset used in the paper for hand-drawn symbol recognition in surgical flowsheet graphs with deep image segmentation.
Multi-scale 3D Convolution Network for Video Based Person Re-Identification
Video-based person re-identification (ReID) using a two-stream convolutional network that explicitly leverages spatial and temporal cues.
Contrails Detection Dataset
A dataset for detecting aircraft contrails in global satellite images.
PID Dataset
A comprehensive dataset for training deep learning algorithms to classify different types of pavement distress.
XCAT phantom
Dataset used to train the prior score model for diffusion posterior sampling in CT image reconstruction.
Cervical Cancer Segmentation on Multiparametric MRI
A dataset of multiparametric MRI scans of cervical cancer patients for segmentation and analysis.
ImageNet VID
Video object detection dataset.