Text Classification - Groups

Diggs dataset

The dataset used for testing the sLDA model [16].
- Dataset
- JSON
ImageNet and SST2 datasets

The datasets used in this study for image and text classification tasks.
- Dataset
- JSON
LLM dataset

The dataset used in this paper is not explicitly described, but it is mentioned that it is a large language model (LLM) and that the authors used it to train and evaluate their...
- Dataset
- JSON
MMLU dataset

The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks from Science, Technology, Engineering, and Math (STEM),...
- Dataset
- JSON
Bibtex

The dataset is used for multilabel learning tasks. It contains 7395 documents, each labeled with a subset of 159 possible tags.
- Dataset
- JSON
SST-2, Irony, IronyB, TREC6, and SNIPS

The datasets used in this paper are SST-2, Irony, IronyB, TREC6, and SNIPS.
- Dataset
- JSON
AGNews

The dataset used in the paper is not explicitly described, but it is mentioned that the authors used a variety of datasets for semi-supervised learning tasks.
- Dataset
- JSON
CIFAR-100 and AGNews

Two datasets used for multi-task learning, CIFAR-100 and AGNews.
- Dataset
- JSON
A Million News Headlines, Fake and real news, Getting Real about Fake News

The dataset is a combination of three separate datasets: A Million News Headlines, Fake and real news, and Getting Real about Fake News.
- Dataset
- JSON
Rotten Tomatoes

The Rotten Tomatoes dataset has 5331 positive and 5331 negative review sentences.
- Dataset
- JSON
Harry Potter unlearning dataset

The dataset used in the paper is a concatenation of the original Harry Potter books and synthetic discussions, blog posts, and wiki-like entries about the books.
- Dataset
- JSON
Sample Selection for Data Augmentation in Natural Language Processing

Deep learning-based text classification models need abundant labeled data to achieve competitive performance. To tackle this, several studies try to use data augmentation to...
- Dataset
- JSON
FNID: Fake News Inference Dataset

A dataset for fake news inference.
- Dataset
- JSON
Detecting Opinion Spams and Fake News Using Text Classification

A dataset for opinion spam and fake news detection.
- Dataset
- JSON
Liar, Liar Pants on Fire: A New Benchmark Dataset for Fake News Detection

A new benchmark dataset for fake news detection, containing 12,836 short statements labeled for truthfulness, subject, context/venue, speaker, state, party, and prior history.
- Dataset
- JSON
TREC dataset

The dataset used in the paper is the TREC dataset, which consists of 124 queries.
- Dataset
- JSON
Amazon dataset

The Amazon dataset is used to evaluate the performance of the proposed approach. It consists of 2000 users, 1500 items, 86690 reviews, 7219 number ratings, 3.6113 average number...
- Dataset
- JSON
Text Classification as Matching

Many-class text classification is formulated as a matching problem between the input texts and the class descriptions.
- Dataset
- JSON
Newsgroups 4

The dataset used in this paper for Dominant Set Clustering.
- Dataset
- JSON
Newsgroups 3

The dataset used in this paper for Dominant Set Clustering.
- Dataset
- JSON
