Semantic Textual Similarity
The STS benchmark (Cer et al., 2017) and SICK-Relatedness dataset (Marelli et al., 2014) respectively contain 8.6K and 9.8K labeled sentence pairs, the sizes of which are usually insufficient for training deep neural networks.
BibTex: