- GLUE benchmark
The paper does not describe its dataset explicitly, but it mentions that the authors used three downstream tasks from the GLUE benchmark: Stanford Sentiment Treebank... (see the sketch after this list).
- BERT: Pre-training of deep bidirectional transformers for language understanding
This paper proposes BERT, a deep bidirectional Transformer that is pre-trained on unlabeled text and then fine-tuned for downstream language-understanding tasks.
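
A minimal sketch, not taken from either paper, of how the two items above typically come together in practice: loading the SST-2 (Stanford Sentiment Treebank) task of the GLUE benchmark and encoding its sentences with a pre-trained BERT model. It assumes the Hugging Face `datasets` and `transformers` libraries; the checkpoint name "bert-base-uncased" and the batch size are illustrative choices, not settings prescribed by the papers.

```python
# Illustrative sketch: GLUE SST-2 sentences encoded with a pre-trained BERT.
from datasets import load_dataset
from transformers import BertTokenizer, BertModel
import torch

# Load a small slice of the SST-2 task from the GLUE benchmark.
sst2 = load_dataset("glue", "sst2", split="train[:8]")

# Load a pre-trained BERT tokenizer and encoder (checkpoint name is an assumption).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Tokenize the batch of sentences and run them through BERT.
batch = tokenizer(sst2["sentence"], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**batch)

# The [CLS] token representation is a common sentence-level feature for
# downstream classification such as sentiment on SST-2.
cls_embeddings = outputs.last_hidden_state[:, 0, :]
print(cls_embeddings.shape)  # torch.Size([8, 768]) for bert-base
```

In a fine-tuning setup, a task-specific classification head would consume the `[CLS]` embedding extracted here and be trained jointly with the encoder on the downstream GLUE task.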