4 datasets found

Formats: JSON
Tags: document classification

  • AllNews

    A collection of news articles from AllNews, as used in the associated paper.
  • Wiki40B

    A collection of documents from Wikipedia, as used in the associated paper.
  • Yahoo Answer and Yelp15 review

    Two large-scale document classification datasets, Yahoo Answer and Yelp15 review, used for topic classification and sentiment classification respectively.
  • CommonCrawl

    CommonCrawl is a non-profit organization that provides a large corpus of web pages for research and development purposes.
You can also access this registry using the API (see API Docs).