Posterior Control of Blackbox Generation
Text generation often requires high-precision output that obeys task-specific rules. This fine-grained control is difficult to enforce with off-the-shelf deep learning models.
SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot
The SHROOM dataset is a collection of data points containing task, input text, target text, and generated text. The dataset is used for hallucination detection in natural...
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Radiology report generation aims to automatically generate a clinically accurate and coherent paragraph from the X-ray image, which could relieve radiologists from the heavy...
WMT’17 metrics task
The dataset used in the paper for validation studies of automatic metrics in natural language generation evaluation.
Famous Keyword Twitter Replies
The Famous Keyword Twitter Replies dataset is a comprehensive collection of Twitter data that focuses on popular keywords and their associated replies.
Text Summarization
The dataset used for the text summarization task, where a summarizer produces an utterance made up of one or multiple sentences to succinctly report the main content of a text.
Reference Games
The dataset used for the reference games task, where participants produce descriptions that allow comprehenders to identify the correct referent out of a set of candidates.
1-billion-word
The One Billion Word language modeling benchmark dataset.
Chinese poetry generation
A dataset for Chinese poetry generation.
Contextual Description Evaluation
The dataset used in the paper to evaluate the effectiveness of referenceless metrics for image accessibility.
ACE 2005, WebNLG, CoNLL, NYT, and FB15k-237
The datasets used in the paper are ACE 2005, WebNLG, CoNLL, NYT, and FB15k-237. The ACE 2005 dataset is a collection of news articles, while WebNLG is a corpus used for natural...
Cross-modal Memory Networks for Radiology Report Generation
Radiology report generation using cross-modal memory networks.
Sentiment-oriented Transformer-based Variational Autoencoder Network for Live...
Sentiment-oriented Transformer-based Variational Autoencoder (So-TVAE) for Live Video Commenting.
Diffusion-LM Improves Controllable Text Generation
Controlling the behavior of language models (LMs) without re-training is a major open problem in natural language generation. We develop a new non-autoregressive language model...
VOLTA: Improving Generative Diversity by Variational Mutual Information Maxim...
The datasets used in the paper are not explicitly described, but the authors mention using six datasets drawn from three different NLG tasks.