The dataset used in the paper is the Stanford Natural Language Inference (SNLI) dataset, whose training split contains 549,367 premise-hypothesis pairs, with separate dev and test sets and target...
Pre-trained language models (PrLMs) must carefully manage their input units when trained on very large text corpora whose vocabularies contain millions of words. Previous works have...
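
The vocabulary pressure described above can be illustrated with a toy sketch. It is not the paper's method: the corpus, the function names (`word_vocab`, `char_vocab`, `oov_rate`), and the choice of whitespace-split words versus single characters as input units are all assumptions made for illustration. It only shows that coarse units (words) leave unseen forms out of vocabulary, while finer units cover them at the cost of longer sequences.

```python
from collections import Counter


def word_vocab(corpus):
    """Count word types, using lowercased whitespace-separated tokens as units."""
    return Counter(tok for line in corpus for tok in line.lower().split())


def char_vocab(corpus):
    """Count character types, using single non-space characters as units."""
    return Counter(ch for line in corpus for ch in line.lower() if not ch.isspace())


def oov_rate(vocab, units):
    """Fraction of input units that do not appear in the vocabulary."""
    if not units:
        return 0.0
    return sum(1 for u in units if u not in vocab) / len(units)


# Hypothetical two-sentence corpus, standing in for a very large one.
toy_corpus = [
    "a man is playing a guitar",
    "a woman is playing the piano",
]

words = word_vocab(toy_corpus)
chars = char_vocab(toy_corpus)

# "pianos" never occurs as a word, but all of its characters do.
unseen_sentence = "a man is playing pianos".split()
word_level_oov = oov_rate(words, unseen_sentence)
char_level_oov = oov_rate(chars, list("pianos"))
```

On this toy input the word-level vocabulary misses the inflected form "pianos", whereas every character of that word is already covered. Real corpora show the same trade-off at scale, which is why the choice of input unit matters.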