LM-Extraction benchmark

The LM-Extraction benchmark contains 15,000 pairs of prefixes and suffixes derived from The Pile dataset (Gao et al., 2020).
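As an illustration only, one such prefix/suffix pair could be represented as in the sketch below. The names `PrefixSuffixPair`, `split_example`, and `suffix_matches` are hypothetical, as are the token IDs and the `prefix_len` parameter. None of them come from the benchmark's own code. The exact-match check is likewise an assumption about how a pair might be used, not the benchmark's official scoring.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PrefixSuffixPair:
    """One benchmark-style example: a prefix and the suffix that follows it."""
    prefix: List[int]
    suffix: List[int]


def split_example(tokens: Sequence[int], prefix_len: int) -> PrefixSuffixPair:
    """Split a token sequence into a prefix and a suffix.

    prefix_len is a hypothetical parameter; the split point is not
    stated in this record.
    """
    if not 0 < prefix_len < len(tokens):
        raise ValueError("prefix_len must leave a non-empty prefix and suffix")
    return PrefixSuffixPair(list(tokens[:prefix_len]), list(tokens[prefix_len:]))


def suffix_matches(pair: PrefixSuffixPair, generated: Sequence[int]) -> bool:
    """Return True if the generated tokens begin with the reference suffix.

    Assumed exact-match criterion, for illustration only.
    """
    return list(generated[: len(pair.suffix)]) == pair.suffix


# Toy token IDs, not real data from The Pile.
pair = split_example([1, 2, 3, 4, 5, 6], prefix_len=3)
print(pair.prefix, pair.suffix)
print(suffix_matches(pair, [4, 5, 6, 7]))
```

Storing token IDs rather than raw strings keeps the split point unambiguous. A string-based variant would work the same way with character or word offsets.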

Cite this as

Zhexin Zhang, Jiaxin Wen, Minlie Huang (2024). Dataset: LM-Extraction benchmark. https://doi.org/10.57702/oyk61fti

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2307.04401
Author         Zhexin Zhang
More Authors   Jiaxin Wen, Minlie Huang
Homepage       https://github.com/google-research/lm-extraction-benchmark