Text Classification - Groups

MPQA Dataset

The MPQA dataset contains 10,606 opinions, and each of them is labeled as Objective or Subjective.
- Dataset
- JSON
CR Dataset

The MR dataset is a movie review repository (containing 10,662 reviews) while CR contains 3,775 reviews about products, e.g. a music player.
- Dataset
- JSON
Movie Review Repository (MR)

The word-level model consists of one convolutional layer, followed by a max pooling layer, a fully connected layer with dropout, and finally a softmax output layer.
- Dataset
- JSON
DBpedia Ontology Dataset

Two representative DNN models and some corresponding datasets are chosen as the experiment targets to evaluate the effectiveness of the proposed method.
- Dataset
- JSON
Banknote Authentication

Data extracted from real images of forged banknotes, captured with an industrial camera.
- Dataset
- JSON
RTE dataset

The RTE (Recognizing Textual Entailment) dataset.
- Dataset
- JSON
FastText

FastText is a subword token embedding model rather than a dataset. It produces a vector representation of a word by composing the embeddings of the word's character n-grams.
- Dataset
- JSON
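As a rough illustration of the character n-gram composition described in the FastText entry, the sketch below splits a word into n-grams and averages one vector per n-gram. It is a minimal sketch under stated assumptions, not FastText's actual implementation: the `<` and `>` boundary markers and the 3 to 6 n-gram range follow common FastText defaults, while `toy_embedding`, `word_vector`, and the 4-dimensional vectors are hypothetical stand-ins for a trained embedding table. A real model also learns its vectors and typically hashes n-grams into a fixed number of buckets.

```python
import random


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def toy_embedding(ngram, dim=4):
    """Hypothetical stand-in for a trained table: a fixed pseudo-random vector per n-gram."""
    rng = random.Random(ngram)  # string seeds give the same vector on every run
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def word_vector(word, dim=4):
    """Compose a word vector by averaging the vectors of its character n-grams."""
    grams = char_ngrams(word)
    vec = [0.0] * dim
    for gram in grams:
        emb = toy_embedding(gram, dim)
        for k in range(dim):
            vec[k] += emb[k]
    # Averaging is an assumption of this sketch; summing is an equally common choice.
    return [v / len(grams) for v in vec] if grams else vec


print(char_ngrams("where", 3, 3))
print(word_vector("where"))
```

Because the vector depends only on the n-grams, a word never seen in training still gets a representation as long as some of its n-grams are known.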
Hatespeech

The Hatespeech dataset is a collection of tweets containing lexicons used in hate speech.
- Dataset
- JSON
Amazon Books

The Amazon Books dataset is a collection of user ratings for books, with each rating indicating the user's preference for the book.
- Dataset
- JSON
C4 dataset

The source paper does not explicitly name its dataset, but it states that the authors trained a GPT2 transformer language model on the C4 dataset.
- Dataset
- JSON
Penn Tree Bank

The Penn Tree Bank dataset is a corpus split into a training set of 929k words, a validation set of 73k words, and a test set of 82k words. The...
- Dataset
- JSON
UNIREX

The UNIREX framework extends the approach to a more general setting.
- Dataset
- JSON
Tweet Sentiment Extraction

The Tweet Sentiment Extraction dataset contains positive, negative, and neutral tweets with human-annotated rationales.
- Dataset
- JSON
Movie Reviews

The Movie Reviews dataset contains positive and negative movie reviews with rationales annotated by humans to support classification.
- Dataset
- JSON
RP dataset

The RP dataset, derived from the RELPRON dataset, consists of 105 noun phrases containing relative clauses.
- Dataset
- JSON
MC (Meaning Classification) dataset

The MC (Meaning Classification) dataset is a specially crafted dataset used for a classification task.
- Dataset
- JSON
Multi-Scale Feature Fusion Quantum Depthwise Convolutional Neural Networks fo...

Text classification is an important and widely studied task in natural language processing (NLP), with extensive applications such as sentiment analysis, topic classification,...
- Dataset
- JSON
IMDb Reviews

The dataset consists of 25,000 reviews from IMDb.
- Dataset
- JSON
DBpedia Animals

The DBpedia Animals dataset comprises 10,000 English Wikipedia article abstracts for animal species.
- Dataset
- JSON
DynaSent

The DynaSent dataset contains approximately 122,000 sentences, each labeled as positive, neutral, or negative.
- Dataset
- JSON

182 datasets found