CNN/Daily Mail corpus

The CNN/Daily Mail corpus pairs online news articles with their summaries. It is split into approximately 287,000 training pairs, 13,368 validation pairs, and 11,490 test pairs.
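
As a rough illustration of how the three splits compare in size, the following sketch computes each split's share of the total from the counts quoted in the description. The training count is the approximate figure given there, so the resulting percentages are approximate as well; the names SPLIT_SIZES and split_fractions are made up for this example and are not part of any dataset tooling.

```python
# Split sizes as quoted in the dataset description.
# The training figure is approximate, so the derived shares are too.
SPLIT_SIZES = {"train": 287_000, "validation": 13_368, "test": 11_490}


def split_fractions(sizes):
    """Return each split's share of the total number of pairs."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


if __name__ == "__main__":
    total = sum(SPLIT_SIZES.values())
    print(f"total pairs (approx.): {total:,}")
    for name, frac in split_fractions(SPLIT_SIZES).items():
        print(f"{name:>10}: {frac:.1%}")
```

With these counts, roughly 92% of the pairs fall in the training split, and the validation and test splits each hold about 4%.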

Cite this as

Karl Moritz Hermann, Tomas Kociský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom (2024). Dataset: CNN/Daily Mail corpus. https://doi.org/10.57702/5iiuea9z

DOI retrieved: November 25, 2024

Additional Info

Field: Value
Created: November 25, 2024
Last update: November 25, 2024
Defined In: https://doi.org/10.18653/v1/W19-8664
Version: non-anonymized version
Author: Karl Moritz Hermann
More Authors: Tomas Kociský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom