-
Divar Dataset
A dataset for measuring the domain similarity of Persian texts, generated from a dataset of advertisements posted on the Divar application. -
Towards Improving Selective Prediction Ability of NLP Systems
SNLI, MNLI, Stress Test, Matched Mismatched, Competence, Distraction, and Noise datasets -
AG News Dataset
AG News consists of news articles from over 2,000 news sources, annotated by type of news: Sports, World, Business, and Science/Tech. It provides 120k training and 7.6k test examples. -
OTTER: Improving Zero-Shot Classification via Optimal Transport
Zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label... -
CNN/DailyMail and XSum
CNN/DailyMail and XSum are both news summarization datasets: CNN/DailyMail pairs news articles with multi-sentence highlights, and XSum pairs BBC news articles with single-sentence summaries. -
Diggs dataset
The dataset used for testing the sLDA model [16]. -
ImageNet and SST2 datasets
The datasets used in this study for image classification (ImageNet) and text classification (SST2) tasks. -
LLM dataset
The dataset used in this paper is not explicitly described, but it is mentioned that it is a large language model (LLM) and that the authors used it to train and evaluate their... -
MMLU dataset
The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks from Science, Technology, Engineering, and Math (STEM),... -
SST-2, Irony, IronyB, TREC6, and SNIPS
The datasets used in this paper are SST-2, Irony, IronyB, TREC6, and SNIPS. -
CIFAR-100 and AGNews
Two datasets used for multi-task learning: CIFAR-100 and AGNews. -
Sem2015-Laptop
The dataset used for Aspect-Based Sentiment Analysis (ABSA) experiments. -
Sem2015-Restaurant
The dataset used for Aspect-Based Sentiment Analysis (ABSA) experiments. -
BeerAdvocate
The dataset used for Aspect-Based Sentiment Analysis (ABSA) experiments. -
CitySearch
The dataset used for Aspect-Based Sentiment Analysis (ABSA) experiments. -
A Million News Headlines, Fake and real news, Getting Real about Fake News
The dataset is a combination of three separate datasets: A Million News Headlines, Fake and real news, and Getting Real about Fake News. -
Rotten Tomatoes
The Rotten Tomatoes dataset has 5331 positive and 5331 negative review sentences.