-
ATIS2 and ATIS3
The ATIS2 and ATIS3 datasets are used to create low-latency natural language understanding components. -
General Language Understanding Evaluation (GLUE) dataset
The General Language Understanding Evaluation (GLUE) dataset is used to evaluate the performance of natural language understanding models. -
FewCLUE dataset
The FewCLUE dataset is a Chinese few-shot learning evaluation benchmark. -
WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language U...
WALNUT is a benchmark for semi-weakly supervised learning for natural language understanding. It consists of 8 NLU tasks with different types, including document-level and... -
ROCStories (+GPT-J)
A corpus and cloze evaluation for deeper understanding of commonsense stories. -
ROCStories
The ROCStories corpus is a collection of crowdsourced five-sentence everyday stories rich in causal and temporal relations. -
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
A corpus and cloze evaluation for deeper understanding of commonsense stories. -
GLUE benchmark
The paper does not describe its dataset explicitly, but the authors use three downstream tasks from the GLUE benchmark: Stanford Sentiment Treebank... -
StackOverflow
The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models. -
BERT: Pre-training of deep bidirectional transformers for language understanding
This paper proposes BERT, a pre-trained deep bidirectional transformer for language understanding. -
GLUE development set
The GLUE development set is used to evaluate the performance of language models.