-
Diggs dataset
The dataset used for testing the sLDA model [16]. -
ImageNet and SST2 datasets
The datasets used in this study for image classification (ImageNet) and text classification (SST2). -
LLM dataset
The paper does not explicitly describe this dataset; it mentions only that it involves a large language model (LLM) and that the authors used it to train and evaluate their... -
MMLU dataset
The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks from Science, Technology, Engineering, and Math (STEM),... -
SST-2, Irony, IronyB, TREC6, and SNIPS
The datasets used in this paper are SST-2, Irony, IronyB, TREC6, and SNIPS. -
CIFAR-100 and AGNews
Two datasets used for multi-task learning: CIFAR-100 and AGNews. -
A Million News Headlines, Fake and real news, Getting Real about Fake News
The dataset combines three individual datasets: A Million News Headlines, Fake and real news, and Getting Real about Fake News. -
Rotten Tomatoes
The Rotten Tomatoes dataset has 5331 positive and 5331 negative review sentences. -
Harry Potter unlearning dataset
The dataset used in the paper is a concatenation of the original Harry Potter books and synthetic discussions, blog posts, and wiki-like entries about the books. -
Sample Selection for Data Augmentation in Natural Language Processing
Deep learning-based text classification models need abundant labeled data to obtain competitive performance. To tackle this, multiple studies try to use data augmentation to... -
FNID: Fake News Inference Dataset
A dataset for fake news inference. -
Detecting Opinion Spams and Fake News Using Text Classification
A dataset for opinion spam and fake news detection. -
Liar, Liar Pants on Fire: A New Benchmark Dataset for Fake News Detection
A new benchmark dataset for fake news detection, containing 12,836 short statements labeled for truthfulness, subject, context/venue, speaker, state, party, and prior history. -
TREC dataset
The dataset used in the paper is the TREC dataset, which consists of 124 queries. -
Amazon dataset
The Amazon dataset is used to evaluate the performance of the proposed approach. It consists of 2000 users, 1500 items, 86690 reviews, 7219 number ratings, 3.6113 average number... -
Text Classification as Matching
Many-class text classification is formulated as a matching problem between the input texts and the class descriptions. -
Newsgroups 4
The dataset used in this paper for Dominant Set Clustering. -
Newsgroups 3
The dataset used in this paper for Dominant Set Clustering.