110 datasets found

Tags: Text Classification

  • AG News, SogouNews and DBpedia

    The AG News, SogouNews and DBpedia datasets are used for text classification experiments.
  • Experimental Results

    The authors evaluate the performance of their proposed conformal prediction methods for multistep feedback covariate shift (MFCS) on synthetic black-box optimization and active...
  • Amazon Reviews

    The Amazon Reviews dataset is used to predict the usefulness of Amazon reviews using off-the-shelf argumentation mining.
  • WebKB

    The dataset used in this paper is a probabilistic version of the WebKB dataset, designed for probabilistic logic programming.
  • Reuters-8

    The Reuters-8 dataset is a collection of news articles from Reuters.
  • 20Newsgrp

    The 20Newsgrp dataset is a collection of posts from 20 different newsgroups.
  • MSMARCO

    The MSMARCO dataset is used for training and evaluating information retrieval (IR) systems and contains a large collection of documents and queries.
  • Twitter

    Dialogue systems – often referred to as conversational agents, chatbots, etc. – provide convenient human-machine interfaces and have become increasingly prevalent with the...
  • News

    The News dataset consists of 5000 randomly sampled news articles from the NY Times corpus. It simulates the opinions of media consumers on news items. The units are different...
  • Emotion Classification

    The Emotion Classification dataset consists of emotion-related text.
  • X-FORMAL

    The X-FORMAL dataset contains pairs of formal and informal texts in four languages: Brazilian Portuguese, French, Italian, and English.
  • GYAFC

    The GYAFC dataset is a formality transfer dataset for English that contains aligned formal and informal sentences from two domains: Entertainment & Music and Family &...
  • MARC

    The MARC dataset is a multilingual text classification dataset covering six languages.
  • M10

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • 20 NewsGroups

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • MR, Subj, SST-1, SST-2, MPQA

    These datasets are used in this paper for text classification tasks.
  • 20NEWS Dataset

    The 20NEWS dataset consists of 18,845 text documents with 20 topic labels.
  • TEL-NLP

    The TEL-NLP dataset is a collection of Telugu text data for four NLP tasks: sentiment analysis, emotion identification, hate speech detection, and sarcasm detection.
  • Yelp Dataset

    The Yelp Dataset contains 1.6M reviews and 500K tips by 366K users for 61K businesses; 481K business attributes, such as hours, parking availability, ambience; and check-ins for...
  • IMDB Sentiment Classification

    The IMDB sentiment classification dataset is used for text classification tasks.
You can also access this registry using the API (see API Docs).