PAWS-X
PAWS-X is a cross-lingual adversarial dataset for paraphrase identification consisting of 23,659 human-translated pairs in six languages (French, Spanish, German, Chinese, Japanese, and Korean). It aims to support multilingual research in paraphrase identification by providing challenging examples: sentence pairs that share most of their words, many of which nonetheless differ in meaning.
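To make "high word overlap but different meaning" concrete, here is a minimal sketch that measures lexical overlap with a Jaccard score over word sets. The function name `word_overlap` and the sentence pair are illustrative assumptions, not taken from the dataset or its tooling, and whitespace tokenization is a simplification that would not work for Chinese or Japanese text.

```python
def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercased word sets of two sentences."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty sentences are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Illustrative pair (an assumption, not a dataset row): same words,
# but the direction of travel is reversed, so it is not a paraphrase.
sentence1 = "Flights from New York to Florida"
sentence2 = "Flights from Florida to New York"

overlap = word_overlap(sentence1, sentence2)
print(f"word overlap: {overlap:.2f}")
```

A model that relies only on shared vocabulary would score this pair as a paraphrase, which is the failure mode that adversarial pairs of this kind are meant to expose.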
BibTeX: