- SemEval-2016 task 1
The SemEval-2016 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation dataset was used to create the Common Voice Spoken Sentence Similarity...
- STS12, STS13, STS14, STS15, STS16, STSb, SICK-R
The STS12, STS13, STS14, STS15, STS16, STSb, SICK-R datasets contain sentence pairs from various sources.
- SICK-Relatedness
The SICK-Relatedness dataset contains about 10,000 sentence pairs derived from image captions and video descriptions.
- STS benchmark
The STS benchmark dataset contains 8,628 sentence pairs from the categories of captions, news, and forums.