-
Famous Keyword Twitter Replies
The Famous Keyword Twitter Replies dataset is a comprehensive collection of Twitter data that focuses on popular keywords and their associated replies. -
Text Summarization
A dataset for the text summarization task, in which a summarizer produces one or more sentences that succinctly report the main content of a text. -
DUC2002, DUC2003, DUC2005 datasets
Multi-document summarization datasets -
Wikibio Dataset
Text summarization and data-to-text generation datasets -
Gigaword and New York Times Annotated Corpus
Text summarization and data-to-text generation datasets -
Towards a unified multi-dimensional evaluator for text generation
The NewsRoom dataset consists of 60 input source texts, each paired with 7 output summaries. -
TEDLIUM Corpus
The TEDLIUM corpus is a large-volume corpus used for speech recognition and text summarization. -
ELI5, FinanceQA, MultiNews, and QMSum datasets
The ELI5, FinanceQA, MultiNews, and QMSum datasets were used in the paper. -
SPACE and AMAZON datasets
The SPACE dataset contains hotel reviews, and the AMAZON dataset contains product reviews. -
CNN/DailyMail
A bus driver who was seriously injured when he was hit by a steam engine is making good progress, his wife has said. -
StreamHover: Livestream Transcript Summarization and Annotation
StreamHover is a framework for annotating and summarizing livestream transcripts. It uses a neural extractive summarization model that leverages vector-quantized variational...