-
LLM dataset
The dataset used in this paper is not explicitly described, but it is mentioned that it is a large language model (LLM) and that the authors used it to train and evaluate their... -
Penn Tree Bank
The Penn Tree Bank dataset is a corpus split into a training, validation and testing set of 929k words, a validation set of 73k words, and a test set of 82k words. The... -
Experimental Results
The authors evaluate the performance of their proposed conformal prediction methods for multistep feedback covariate shift (MFCS) on synthetic black-box optimization and active... -
SlimPajama
The dataset is used to evaluate the performance of the xLSTM architecture on various tasks, including language modeling, question answering, and text classification. -
Natural Questions
The Natural Questions dataset consists of questions extracted from web queries, with each question accompanied by a corresponding Wikipedia article containing the answer. -
Stanford Alpaca
The dataset used in the paper is not explicitly described, but it is mentioned that the authors used CIFAR-10 and CIFAR-100 datasets for image classification, and ImageNet-100... -
Penn Treebank
The Penn Treebank dataset contains one million words of 1989 Wall Street Journal material annotated in Treebank II style, with 42k sentences of varying lengths.