20 datasets found

Tags: Topic Modeling

  • AGNews

    The dataset used in the paper is not explicitly described, but it is mentioned that the authors used a variety of datasets for semi-supervised learning tasks.
  • Exact slice sampler for Hierarchical Dirichlet Processes

    Hierarchical Dirichlet Process (HDP) mixture model for modeling the hierarchy of groups of data.
  • NOISE dataset

    The NOISE dataset is a semi-synthetic dataset constructed from the matrix A∗, with data generated as y = A∗x + ζ, where ζ is noise.
  • NEG dataset

    The NEG dataset is a semi-synthetic dataset constructed from the matrix A∗, where the entries of A∗ are i.i.d. samples from the uniform distribution on [−0.5, 0.5).
  • CTM dataset

    The CTM dataset is a semi-synthetic dataset constructed from the matrix X, whose columns are drawn from the logistic normal prior in the correlated topic model.
  • DIR dataset

    The DIR dataset is a semi-synthetic dataset constructed from the matrix X, whose columns are drawn from a Dirichlet distribution with parameters (0.05, 0.05, ..., 0.05).
  • Enrico

    Enrico: A dataset for topic modeling of mobile UI designs
  • SearchSnippets

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • M10

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • 20 NewsGroups

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • 20NEWS Dataset

    The dataset used in the paper is the 20NEWS dataset, consisting of 18,845 text documents with 20 topic labels.
  • Graphbtm Dataset

    The Graphbtm dataset is the dataset used with GraphBTM, a biterm topic model.
  • RCV1 Dataset

    The RCV1 dataset is a corpus of Reuters news articles.
  • Reddit News Topical Interactions

    The dataset used in this study was gathered from the Pushshift Reddit repository, which archives all Reddit posts and comments up to June 2021.
  • News Articles Dataset

    The dataset used in this paper is a collection of news articles from an international news website, covering a time span from September 2012 to April 2014.
  • StackOverflow

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • Wikipedia Comparable Corpora

    Multilingual dataset for topic modeling based on aligned Wikipedia articles extracted from Wikipedia Comparable Corpora
  • Synthetic Dataset

    The dataset used in this work is a custom synthetic dataset generated using the liquid-dsp library, containing 600000 examples of each of 13.8 million examples, with SNRs...
  • Topic modeling of multimodal data: an autoregressive approach

    Topic modeling of multimodal data: an autoregressive approach
  • Subjectivity Dataset

    The Subjectivity dataset is a dataset provided by [Pang and Lee, 2004].
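
The semi-synthetic entries above (NOISE, NEG, CTM, DIR) each describe a generation recipe, which can be sketched roughly as below. This is only an illustration under stated assumptions, not the original authors' procedure: the function name `make_semi_synthetic`, the matrix dimensions, the Gaussian form and scale of the noise ζ, and the zero-mean, identity-covariance Gaussian behind the logistic normal draw are all hypothetical placeholders, since the entries do not specify them.

```python
import numpy as np

def make_semi_synthetic(n_rows=50, n_topics=10, n_samples=200,
                        noise_std=0.01, seed=0):
    """Illustrative generator; all sizes and noise settings are assumptions."""
    rng = np.random.default_rng(seed)

    # NEG-style: entries of A* are i.i.d. uniform on [-0.5, 0.5).
    A_star = rng.uniform(-0.5, 0.5, size=(n_rows, n_topics))

    # DIR-style: each column of X is drawn from Dirichlet(0.05, ..., 0.05).
    X_dir = rng.dirichlet(np.full(n_topics, 0.05), size=n_samples).T

    # CTM-style: logistic normal columns, i.e. a softmax of Gaussian draws.
    # The zero mean and identity covariance are placeholders (assumption).
    eta = rng.multivariate_normal(np.zeros(n_topics), np.eye(n_topics),
                                  size=n_samples)
    eta -= eta.max(axis=1, keepdims=True)  # for numerical stability
    X_ctm = (np.exp(eta) / np.exp(eta).sum(axis=1, keepdims=True)).T

    # NOISE-style: y = A* x + zeta, with Gaussian zeta (assumption).
    zeta = rng.normal(0.0, noise_std, size=(n_rows, n_samples))
    Y = A_star @ X_dir + zeta

    return A_star, X_dir, X_ctm, Y
```

Calling `make_semi_synthetic()` returns the matrix A∗, the two variants of X (one column per sample), and the noisy observations Y. Swapping `X_dir` for `X_ctm` in the last step would give a noisy CTM-style variant.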
You can also access this registry using the API (see API Docs).