English Gigaword Corpus

An English monolingual corpus used to create synthetic training data for models through back-translation.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Meng Sun, Bojian Jiang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang (2024). Dataset: English Gigaword Corpus. https://doi.org/10.57702/fn3hvzev

DOI retrieved: November 25, 2024

Field	Value
Created	November 25, 2024
Last update	November 25, 2024
Defined In	https://doi.org/10.18653/v1/W19-5341
Author	Meng Sun
More Authors	Bojian Jiang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang