-
GMEG-wiki and GMEG-yahoo
The GMEG-wiki and GMEG-yahoo datasets are used to evaluate the proposed approach. -
CoNLL-2014
The task of grammatical error correction (GEC) is to map an ungrammatical sentence x_bad into a grammatical version of it, x_good. -
LM-Critic: Language Models for Unsupervised Grammatical Error Correction
Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical / grammatical sentence pairs, but manually annotating such pairs can be expensive. -
GLUE benchmark
The paper does not explicitly describe its dataset, but the authors use three downstream tasks from the GLUE benchmark: Stanford Sentiment Treebank... -
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based, R...
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based, Role-Playing Video Games -
Contra State Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
Contra Instruction Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
Contra Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
SlimPajama
The dataset is used to evaluate the performance of the xLSTM architecture on various tasks, including language modeling, question answering, and text classification. -
TESS: Text-to-Text Self-Conditioned Simplex Diffusion
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models... -
How do large language models capture the ever-changing world knowledge?
This paper presents a review of recent advances in large language models' ability to capture ever-changing world knowledge. -
Masked Acoustic Unit for Mispronunciation Detection and Correction
The proposed method uses the acoustic unit (AU) as the intermediary feature for both mispronunciation detection and correction. -
Language models are few-shot learners
A language model that demonstrates capabilities in processing and generating human-like text. -
Self-Supervised Alignment with Mutual Information
The dataset is used for training a language model to follow behavioral principles without the use of preference labels, demonstrations, or human oversight. -
NLPositionality
NLPositionality is a framework for characterizing design biases and quantifying the positionality of NLP datasets and models.