Dataset - LDM

PROPSEGMENT

The PROPSEGMENT dataset is a large-scale corpus for proposition-level segmentation and entailment recognition.
- Dataset
- JSON

Sub-sentence encoder

The sub-sentence encoder is a contrastive learning framework that produces contextual embeddings for semantic units at the sub-sentence level.
- Dataset
- JSON

MIRFLICKR-25K

The MIRFLICKR-25K dataset consists of 25015 images and 223635 tags, where each image is associated with several textual tags and annotated with a 24-dimensional semantic label.
- Dataset
- JSON

NUS-WIDE

NUS-WIDE, as used in the paper, is a multi-view clustering dataset containing 6 views of 30000 samples each. It is used to evaluate the performance of the proposed...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
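
As a rough illustration of API access, the sketch below builds a search URL and summarizes a response. It assumes the registry exposes a CKAN-style Action API (`/api/3/action/package_search`), which this page does not confirm. The base URL and the helper names `build_search_url` and `summarize` are hypothetical; consult the API Docs for the actual endpoints.

```python
from urllib.parse import urlencode

# Hypothetical base URL (assumption); replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=10, base_url=BASE_URL):
    """Build a search URL, assuming a CKAN-style package_search endpoint."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def summarize(payload):
    """Return (count, titles) from a CKAN-style package_search response dict."""
    if not payload.get("success"):
        raise ValueError("search request was not successful")
    result = payload["result"]
    # Fall back to the machine-readable name when no title is set.
    titles = [item.get("title") or item.get("name") for item in result["results"]]
    return result["count"], titles
```

To query a live registry, the URL from `build_search_url` could be fetched with `urllib.request.urlopen` and decoded with `json.load` before being passed to `summarize`.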

4 datasets found