9 datasets found

Tags: topic modeling

  • CAL500

    In text categorization, a document may be associated with a range of topics, such as science, entertainment, and news.
  • 20News

    Topic modeling has been a widely used tool for unsupervised text analysis. However, comprehensive evaluations of a topic model remain challenging.
  • 20Newsgroups dataset

    The 20Newsgroups dataset contains 18,846 newsgroup documents.
  • AGNews Dataset

    The AGNews dataset is a collection of news articles, where each article is labeled with a topic (e.g., politics or sports).
  • GoogleNews

    The dataset used in this paper is a collection of news articles from Google News.
  • Wiki20K

    The dataset used in this paper is a collection of English Wikipedia abstracts from DBpedia.
  • 20NewsGroups

    The dataset used in this paper is a collection of documents from various domains, including news, articles, and emails.
  • Wikitext-103

    The dataset used in this paper is Wikitext-103, a general English language corpus containing good and featured Wikipedia articles.
  • Reuters RCV1-v2

    Reuters RCV1-v2 contains 804,414 newswire articles. Its 103 topics form a tree hierarchy, so documents typically have multiple labels. The data was randomly...