PAWS-X

PAWS-X is a cross-lingual adversarial dataset for paraphrase identification. It consists of 23,659 human-translated sentence pairs in six languages (French, Spanish, German, Chinese, Japanese, and Korean). It supports multilingual research on paraphrase identification by providing challenging pairs that have high word overlap yet differ in meaning.
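To make "high word overlap but different meaning" concrete, the following is a minimal sketch that measures bag-of-words overlap for one English pair. The sentence pair is an illustration rather than a row read from the dataset files, the helper `word_overlap` is hypothetical, and the label convention (0 = not a paraphrase, 1 = paraphrase) is an assumption to verify against the dataset documentation.

```python
def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two sentences."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty sentences are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Illustrative pair (not loaded from the dataset): same words, different meaning.
sentence1 = "Flights from New York to Florida"
sentence2 = "Flights from Florida to New York"
label = 0  # assumed convention: 0 = not a paraphrase, 1 = paraphrase

overlap = word_overlap(sentence1, sentence2)
print(f"overlap={overlap:.2f}, label={label}")
```

Both sentences use exactly the same set of words, so the overlap score is maximal even though the travel direction is reversed. A classifier that relies on word overlap alone would likely mislabel such a pair, which is the kind of weakness an adversarial paraphrase dataset is meant to expose.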

Cite this as

Yinfei Yang, Yuan Zhang, Chris Tar, Jason Baldridge (2024). Dataset: PAWS-X. https://doi.org/10.57702/edrfwgir

DOI retrieved: November 25, 2024

Additional Info

Field          Value
Created        November 25, 2024
Last update    November 25, 2024
Defined In     https://doi.org/10.48550/arXiv.1908.11828
Author         Yinfei Yang
More Authors   Yuan Zhang, Chris Tar, Jason Baldridge
Homepage       https://github.com/google-research-datasets/paws