Dataset - LDM

SemEval-2016 task 1

The SemEval-2016 Task 1 dataset (Semantic Textual Similarity: multilingual and crosslingual focused evaluation) was used to create the Common Voice Spoken Sentence Similarity...
- Dataset
- JSON
STS12, STS13, STS14, STS15, STS16, STSb, SICK-R

The STS12, STS13, STS14, STS15, STS16, STSb, and SICK-R datasets contain sentence pairs from various sources.
- Dataset
- JSON
SICK-Relatedness

The SICK-Relatedness dataset contains 1,000 sentence pairs from the categories of captions, news, and forums.
- Dataset
- JSON
STS benchmark

The STS benchmark dataset contains 8,628 sentence pairs from the categories of captions, news, and forums.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

4 datasets found