103 datasets found

Tags: text classification

  • Amazon Reviews

    The Amazon Reviews dataset is used to predict the usefulness of Amazon reviews with off-the-shelf argumentation mining.
  • news20

    The news20 dataset is a multiclass text classification dataset.
  • sector

    The sector dataset is a multiclass text classification dataset.
  • rcv1

    The rcv1 dataset is a multiclass text classification dataset.
  • Yelp reviews polarity dataset

    Yelp reviews polarity dataset
  • Cnews dataset

    The Cnews dataset is a collection of news articles from Sina News, filtered from 2005 to 2011. The dataset contains 10 categories of news, including sports, entertainment, home...
  • IMDB Sentiment

    The dataset used for training and evaluation of the proposed RRHF paradigm.
  • CNN/DailyMail

    A bus driver who was seriously injured when he was hit by a steam engine is making good progress, his wife has said.
  • Ren-CECps

    Multi-label text classification dataset Ren-CECps
  • RCV1-v2

    Multi-label text classification dataset RCV1-v2, Reuters Corpus Volume I
  • Reuters-21578

    Text classification has long been an active research field; its aim is to develop algorithms that find the categories of given documents.
  • Yelp Dataset Challenge

    The Yelp dataset challenge contains reviews and images of restaurants, with the goal of recommending images for each review.
  • Natural Instructions

    The Natural Instructions (NI) dataset is used for evaluating the performance of the DEPTH model on natural language understanding tasks.
  • DiscoEval

    The DiscoEval dataset is used for evaluating the performance of the DEPTH model on discourse-related tasks.
  • C4

    The dataset used for pre-training language models, containing a large collection of text documents.
  • Amazon@Beauty and Amazon@Books datasets

    The Amazon@Beauty and Amazon@Books datasets are collections of product reviews from Amazon.com.
  • The pushshift reddit dataset

    The pushshift reddit dataset
  • IMDB dataset

    The IMDB dataset is a polarity dataset for sentiment analysis or text classification. It contains 50000 sentences and their binary class labels, being either "Positive" or...
  • SST-2

    The dataset used for experiments across ten models, ranging from bag-of-words models to pre-trained transformers, which find that a model having higher AUC does not necessarily...
  • Uniter dataset

    The Uniter dataset is a multimodal learning dataset, which consists of images and corresponding text.
You can also access this registry using the API (see API Docs).
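
As a rough illustration of API access, the sketch below builds a search URL for datasets carrying a given tag. It assumes the registry exposes a CKAN-style Action API with a `package_search` endpoint, which this page does not confirm. The base URL `https://registry.example.org` and the helper name `build_search_url` are hypothetical placeholders, so check the API Docs for the actual endpoint and parameters.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(tag, rows=20, start=0, base_url=BASE_URL):
    """Build a CKAN-style package_search URL filtered by tag.

    Assumption: the registry follows CKAN's Action API layout
    (/api/3/action/package_search). This is not confirmed by the page.
    """
    params = {
        "fq": f'tags:"{tag}"',  # filter query restricting results to one tag
        "rows": rows,           # number of results per page
        "start": start,         # offset of the first result
    }
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


url = build_search_url("text classification")
print(url)
```

The script only constructs the URL and makes no network request. If the assumption holds, fetching that URL would return a JSON document describing the matching datasets, and increasing `start` would page through them.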