MTG: A Benchmark Suite for Multilingual Text Generation
MTG is a multilingual multiway text generation benchmark suite. It is the first proposed multilingual multiway text generation dataset and provides the largest human-annotated data of its kind.
News-to-Report Dataset
A dataset for automatically generating macro research reports from economic news.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
SQuAD is a reading comprehension benchmark of 100,000+ crowdsourced questions on Wikipedia articles, where the answer to each question is a span of text from the corresponding passage.
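As an illustration of the span-based answer format, here is a minimal sketch with a made-up passage; the flat field layout (context, question, answers with character offsets) is an assumed convenience representation, not the official nested JSON schema.

```python
# Minimal sketch of a SQuAD-style record. The passage is made up, and the
# flat field layout (context / question / answers with character offsets)
# is an assumed convenience format, not the official nested JSON schema.
example = {
    "context": "The Amazon rainforest covers much of the Amazon basin of South America.",
    "question": "Which basin does the Amazon rainforest cover?",
    "answers": {"text": ["the Amazon basin"], "answer_start": [37]},
}

# Every answer is a span of the context, recoverable from its character offset.
start = example["answers"]["answer_start"][0]
answer = example["answers"]["text"][0]
assert example["context"][start:start + len(answer)] == answer
```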
Bold Dataset
The BOLD dataset contains prompts for open-ended text generation, used to benchmark social bias in language models across domains such as profession, gender, race, religion, and political ideology.
Content Preserving Text Generation with Attribute Controls
Data accompanying the paper, used for generating sentences that modify specified attributes of the input while preserving its content.
Towards a unified multi-dimensional evaluator for text generation
The NewsRoom evaluation set used here consists of 60 source articles, each paired with 7 system-output summaries.
Wikipedia Neutrality Corpus
This dataset is used to test the ability of large language models to detect and correct biased Wikipedia edits according to Wikipedia's Neutral Point of View (NPOV) policy.
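For illustration, a WNC-style item pairs a sentence flagged in an NPOV edit with its neutralized revision. The pair below is invented, and the field names are an assumption about how such pairs might be represented, not the corpus's released column names.

```python
# Invented example of a biased/neutralized sentence pair in the spirit of the
# Wikipedia Neutrality Corpus; field names are illustrative only.
npov_pair = {
    "biased": "The senator delivered a brilliant speech on the new policy.",
    "neutral": "The senator delivered a speech on the new policy.",
}

# A detection-and-correction system could be scored on recovering the edit,
# e.g., by exact match against the neutralized sentence.
def exact_match(prediction: str, reference: str) -> bool:
    return prediction.strip() == reference.strip()
```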
ROCStories (+GPT-J)
A corpus and cloze evaluation for deeper understanding of commonsense stories.
ROCStories
The ROCStories corpus is a collection of crowdsourced five-sentence everyday stories rich in causal and temporal relations.
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
The paper that introduces the ROCStories corpus and its accompanying Story Cloze evaluation.
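The cloze evaluation pairs a four-sentence context with two candidate endings, and systems are scored on choosing the coherent one. The item below is invented, and the field names are illustrative rather than the released column names.

```python
# Invented Story Cloze-style item: four context sentences plus two candidate
# endings; field names are illustrative, not the released column names.
cloze_item = {
    "context": [
        "Mia's old laptop finally stopped turning on.",
        "She set aside part of her paycheck for two months.",
        "On Saturday she went to the electronics store.",
        "She compared several models before making up her mind.",
    ],
    "endings": [
        "Mia bought a new laptop and took it home.",   # coherent ending
        "Mia decided she had no use for a computer.",  # incoherent ending
    ],
    "label": 0,  # index of the correct ending
}

def cloze_accuracy(predictions, items):
    """Fraction of items where the predicted ending index matches the label."""
    return sum(p == it["label"] for p, it in zip(predictions, items)) / len(items)
```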
The E2E dataset
The E2E dataset contains restaurant-domain meaning representations described by 8 attributes, including food type, price range, and customer rating, each paired with natural-language descriptions.
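The meaning representations are flat attribute[value] lists. The MR/reference pair below is invented and the exact slot names are an assumption (the released data defines the canonical set of 8 attributes); the sketch only shows how such an MR string can be parsed.

```python
import re

# Invented E2E-style MR/reference pair. The slot names here are assumptions;
# the released dataset defines the canonical inventory of 8 attributes.
mr = ("name[The Olive Grove], eatType[restaurant], food[Italian], "
      "priceRange[moderate], customerRating[4 out of 5], area[riverside], "
      "familyFriendly[yes], near[the city museum]")
reference = ("The Olive Grove is a moderately priced, family-friendly Italian "
             "restaurant on the riverside near the city museum, rated 4 out of 5.")

def parse_mr(mr_string: str) -> dict:
    """Parse an 'attribute[value], attribute[value]' MR string into a dict."""
    return dict(re.findall(r"(\w+)\[([^\]]*)\]", mr_string))

slots = parse_mr(mr)
assert slots["food"] == "Italian" and len(slots) == 8
```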
PersonaChat
Persona-Chat is sourced from conversations between human annotators who are randomly paired and asked to act out assigned persona profiles.
CLIP-GLaSS
In the text-to-image task, the context consists of 20 tokens, to which three fixed tokens representing the static context "the picture of" are concatenated.
Wikitext-2
The paper uses WikiText-2, a language-modeling corpus of roughly 2 million training tokens drawn from verified Good and Featured Wikipedia articles, for its text generation experiments.
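A minimal loading sketch, assuming the Hugging Face `datasets` library is used; the `wikitext-2-raw-v1` configuration keeps the raw, untokenized article text.

```python
# Sketch of loading WikiText-2 via the Hugging Face `datasets` library
# (assumes the library is installed; "wikitext-2-raw-v1" is the raw-text
# configuration, "wikitext-2-v1" the pre-tokenized one with <unk> tokens).
from datasets import load_dataset

wikitext = load_dataset("wikitext", "wikitext-2-raw-v1")
print(wikitext)                        # train / validation / test splits
print(wikitext["train"][10]["text"])   # each record exposes a single "text" field
```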
Wizard of Wikipedia
Wizard of Wikipedia is a recent, large-scale dataset of multi-turn knowledge-grounded dialogues between an "apprentice" and a "wizard", who has access to knowledge retrieved from Wikipedia.