-
Yahoo and Yelp corpora
The Yahoo and Yelp corpora contain 100k sentences with a greater average length. -
Training CLIP models on Data from Scientific Papers
Contrastive Language-Image Pretraining (CLIP) models are trained with datasets extracted from web crawls, which are of large quantity but limited quality. This paper explores... -
Goal Driven Discovery of Distributional Differences via Language Descriptions
Describing differences between text distributions with natural language. -
Validation Dataset
The Validation Dataset is used for validation; it contains 1,428 images from nine distinct rooms. -
LV-BERT: Exploiting Layer Variety for BERT
Modern pre-trained language models are mostly built upon backbones stacking self-attention and feed-forward layers in an interleaved order. This paper aims to improve... -
CIFAR-10, CIFAR-100, Stanford background dataset, VOC2012 dataset, Rotten Tom...
The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used CIFAR-10 and CIFAR-100 datasets for image classification, and Stanford... -
Penn Treebank
The Penn Treebank dataset contains one million words of 1989 Wall Street Journal material annotated in Treebank II style, with 42k sentences of varying lengths. -
GPT-2 small
Rather than a conventional dataset, this paper uses a language model, GPT-2 small, and its residual stream activations. -
GLOW : Global Weighted Self-Attention Network for Web Search
GLOW is a novel Global Weighted Self-Attention Network for web document search. It incorporates global corpus statistics into the deep matching model. -
BERT: Pre-training of deep bidirectional transformers for language understanding
This paper proposes BERT, a pre-trained deep bidirectional transformer for language understanding. -
Text-to-Image Synthesis Dataset
This dataset is used for text-to-image synthesis. -
SST2, SST5, MR, IMDB, AG News
The datasets used for the sentence classification task. -
Ego4D Goal-Step
The Ego4D Goal-Step dataset is a large-scale egocentric video dataset that contains 3,000 hours of egocentric video. The dataset is used for action recognition, action... -
String Transformation Tasks
A publicly available data set of 130 real world string transformation tasks from Cropper and Dumancic [2020]. -
GLUE development set
The GLUE development set is a dataset used for evaluating the performance of language models. -
LLaMA-7B and LLaMA-13B models
The paper does not explicitly name its dataset, but it states that the authors used the LLaMA-7B and LLaMA-13B models and the GLUE development set.