Text retrieval - Groups

PROPSEGMENT

The PROPSEGMENT dataset is a large-scale corpus for proposition-level segmentation and entailment recognition.

Dataset
JSON

Sub-sentence encoder

The sub-sentence encoder is a contrastive learning framework that learns contextual embeddings for semantic units at the sub-sentence level.

Dataset
JSON

MIRFLICKR-25K

The MIRFLICKR-25K dataset consists of 25,015 images and 223,635 tags. Each image is associated with several textual tags and annotated with a 24-dimensional semantic label.

Dataset
JSON

NUS-WIDE

NUS-WIDE, as used in the paper, is a multi-view clustering dataset containing 6 views of 30,000 samples each. The dataset is used to evaluate the performance of the proposed...

Dataset
JSON

4 datasets found
