Text Classification - Groups

IMDB Sentiment Classification

The IMDB sentiment classification dataset is used for text classification tasks.
- Dataset
- JSON
CNN/DailyMail

A bus driver who was seriously injured when he was hit by a steam engine is making good progress, his wife has said.
- Dataset
- JSON
Ren-CECps

Multi-label text classification dataset Ren-CECps
- Dataset
- JSON
RCV1-v2

Multi-label text classification dataset RCV1-v2, Reuters Corpus Volume I
- Dataset
- JSON
20-Newsgroups dataset

The 20-Newsgroups dataset is a collection of text documents.
- Dataset
- JSON
Twitter and Pinterest dataset

The dataset used for the experiments on Twitter and Pinterest.
- Dataset
- JSON
REDDIT-BINARY dataset

The REDDIT-BINARY dataset contains 2,000 graphs, each labeled as a question/answer-based or discussion-based community on the content-aggregation website Reddit.
- Dataset
- JSON
Full

The dataset used for sentiment analysis and topic classification tasks.
- Dataset
- JSON
Polarity

The dataset used for sentiment analysis and topic classification tasks.
- Dataset
- JSON
Yahoo

The Yahoo dataset used for training and testing the proposed model, containing leaked passwords.
- Dataset
- JSON
DBP

The dataset used for sentiment analysis and topic classification tasks.
- Dataset
- JSON
AG

The dataset used for sentiment analysis and topic classification tasks.
- Dataset
- JSON
BERT

A pre-trained BERT model trained on the English Wikipedia and Books datasets.
- Dataset
- JSON
Reuters-21578

Text classification has long been an active research field; its aim is to develop algorithms that find the categories of given documents.
- Dataset
- JSON
Amazon Review

The Amazon Review dataset is a widely used benchmark dataset for cross-domain sentiment analysis.
- Dataset
- JSON
Text Classification based on Multiple Block Convolutional Highways

Text classification based on Multiple Block Convolutional Highways
- Dataset
- JSON
Yelp Dataset Challenge

The Yelp dataset challenge contains reviews and images of restaurants, with the goal of recommending images for each review.
- Dataset
- JSON
C4

The dataset used for pre-training language models, containing a large collection of text documents.
- Dataset
- JSON
Amazon@Beauty and Amazon@Books datasets

The Amazon@Beauty and Amazon@Books datasets are collections of product reviews from Amazon.com.
- Dataset
- JSON
OpenWebText Corpus

A dataset for language modeling, where the goal is to predict the next word in a sequence given the previous words.
- Dataset
- JSON

211 datasets found