The AMI Meeting Corpus: A Multimodal Corpus for Meeting Transcription
The AMI Meeting Corpus is a multimodal corpus containing audio and video recordings of meetings.
EasyCom: An Augmented Reality Dataset for Easy Communication in Noisy Environments
The EasyCom dataset is a relatively new dataset recorded using Meta's augmented-reality (AR) glasses.
RWTH-PHOENIX-Weather
Continuous sign language recognition (SLR) deals with unaligned video-text pairs and uses word error rate (WER), i.e., edit distance, as its main evaluation metric.
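WER is the word-level Levenshtein (edit) distance between the hypothesis and the reference, normalized by reference length. A minimal sketch of the computation (function name and whitespace tokenization are illustrative choices, not part of any benchmark toolkit):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to the reference.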
ShapeNeRF–Text
The ShapeNeRF–Text dataset consists of 40K paired NeRFs and language annotations for ShapeNet objects.
IMAGINE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
Automatic evaluation for natural language generation (NLG) conventionally relies on token-level or embedding-level comparisons with text references. This is different from...
Multimodal Variational Autoencoder for Cardiac Hemodynamics Instability Detection
A multimodal variational autoencoder for low-cost detection of cardiac hemodynamic instability from chest X-rays (CXR) and electrocardiograms (ECG).
BioMedClip
BioMedClip is a CLIP model pretrained on image-text pairs extracted from the PubMed Central repository.
Training CLIP models on Data from Scientific Papers
Contrastive Language-Image Pretraining (CLIP) models are trained on datasets extracted from web crawls, which are large in quantity but of limited quality. This paper explores...
Conceptual Captions
The dataset used in the paper "Scaling Laws of Synthetic Images for Model Training", where it serves supervised image classification and zero-shot classification tasks.
SSv2-Small
A few-shot split of the Something-Something V2 video dataset. Few-shot classification is a research area that focuses on identifying new classes from a small number of samples.
CLIP-guided Prototype Modulating for Few-shot Action Recognition
Few-shot action recognition is a promising direction for alleviating the data-labeling problem; it aims to identify unseen classes from a few labeled videos.
Conceptual 12M
Conceptual 12M (CC12M) is a dataset of roughly 12 million image-text pairs for automatic image captioning.