103 datasets found

Tags: text classification

  • P-Stance, SemEval-2016, and MTSD datasets

    A collection of datasets for political stance detection: P-Stance, SemEval-2016, and MTSD.
  • UNIREX

    The UNIREX framework extends the approach to a more general setting.
  • Movie Reviews

    The Movie Reviews dataset contains positive and negative movie reviews with rationales annotated by humans to support classification.
  • DBpedia Animals

    The DBpedia Animals dataset comprises 10,000 English Wikipedia article abstracts for animal species.
  • DynaSent

    The DynaSent dataset contains approximately 122,000 sentences, each labeled as positive, neutral, or negative.
  • Rcv1: A new benchmark collection for text categorization research

    RCV1 is a benchmark collection for text categorization research.
  • AGNews, 20News, NYT, IMDB

    AGNews, 20News, NYT, and IMDB are datasets used for weakly supervised text classification.
  • HateXplain

    The HateXplain dataset contains 20,000 posts from Gab and Twitter, each annotated with a hate, offensive, or normal label.
  • CLIMABENCH

    CLIMABENCH is a benchmark of climate-related text classification tasks. It collates five existing climate change-related text datasets, including CLIMATEXT, CLIMATESTANCE,...
  • AllNews

    A collection of news articles from AllNews.
  • Wiki40B

    A collection of documents from Wikipedia.
  • NeurIPS dataset

    The NeurIPS dataset is a collection of 7,241 papers published in NeurIPS from 1987 to 2016.
  • Wikipedia dataset

    The Wikipedia dataset contains over six million English Wikipedia articles with a full-text field associated with 50 training queries...
  • 20News

    Topic modeling has been a widely used tool for unsupervised text analysis. However, comprehensive evaluations of a topic model remain challenging.
  • Reuters Dataset

    The Reuters dataset is a text classification dataset containing 21,578 samples.
  • Text Classification Datasets

    A collection of adversarial examples and natural examples for text classification tasks.
  • Shakespeare dataset

    Mobile crowdsensing has gained significant attention in recent years and has become a critical paradigm for emerging Internet of Things applications. The sensing devices...
  • TFDS

    Text dataset for text classification and sentiment analysis tasks.
  • NYT

    Text summarization aims to extract essential information from a piece of text and transform the text into a concise version.
  • Reuters21578

    The problem of similarity search is to find the most similar items in a large collection to a query item of interest. Fast similarity search is at the core of many information...
You can also access this registry using the API (see API Docs).