The E2E dataset
The E2E dataset contains restaurant reviews labeled with 8 fields, including food type, price, and customer rating. -
MTTN: Multi-Pair Text to Text Narratives for Prompt Generation
A large-scale dataset for generating prompts that can be used in diffusion models for text-to-text generation tasks. -
CLIP-GLaSS
The dataset used for the text-to-image task consists of 20 context tokens, to which three fixed tokens have been concatenated, representing the static context "the picture of". -
Wikitext-2
The source paper does not describe its dataset explicitly, but it mentions using the Wikitext-2 dataset for text generation tasks. -
TESS: Text-to-Text Self-Conditioned Simplex Diffusion
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models... -
Language models are few-shot learners
A language model that demonstrates capabilities in processing and generating human-like text. -
Prompt Highlighter
Prompt Highlighter is a novel paradigm for user-model interactions in multi-modal LLMs, offering output control through a token-level highlighting mechanism. -
STC dataset
The STC dataset is a short text conversation dataset used for evaluating the performance of conversation response generation models. -
Wikitext-103
The dataset used in this paper is Wikitext-103, a general English language corpus containing good and featured Wikipedia articles. -
Synthetic Dataset
The dataset used in this work is a custom synthetic dataset generated using the liquid-dsp library, containing 600000 examples of each of 13.8 million examples, with SNRs... -
SeqDiffuSeq
The dataset used in the SeqDiffuSeq paper for sequence-to-sequence text generation.