16 datasets found

Tags: Topic Modeling

  • Exact slice sampler for Hierarchical Dirichlet Processes

    Hierarchical Dirichlet Process (HDP) mixture model for modeling the hierarchy of groups of data.
  • NOISE dataset

    The NOISE dataset is a semi-synthetic dataset constructed from the matrix A∗, with data generated as y = A∗x + ζ, where ζ is noise.
  • NEG dataset

    The NEG dataset is a semi-synthetic dataset constructed from the matrix A∗, where the entries of A∗ are i.i.d. samples from the uniform distribution on [−0.5, 0.5).
  • CTM dataset

    The CTM dataset is a semi-synthetic dataset constructed from the matrix X, whose columns are drawn from the logistic normal prior in the correlated topic model.
  • DIR dataset

    The DIR dataset is a semi-synthetic dataset constructed from the matrix X, whose columns are drawn from a Dirichlet distribution with parameters (0.05, 0.05, ..., 0.05).
  • Enrico

    Enrico: A dataset for topic modeling of mobile UI designs
  • SearchSnippets

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • M10

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • 20 NewsGroups

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • Graphbtm Dataset

    The Graphbtm dataset is a biterm topic model.
  • RCV1 Dataset

    The RCV1 dataset is a corpus of Reuters news articles.
  • News Articles Dataset

    The dataset used in this paper is a collection of news articles from an international news website, covering a time span from September 2012 to April 2014.
  • StackOverflow

    The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
  • Wikipedia Comparable Corpora

    Multilingual dataset for topic modeling based on aligned Wikipedia articles extracted from Wikipedia Comparable Corpora.
  • Synthetic Dataset

    The dataset used in this work is a custom synthetic dataset generated using the liquid-dsp library, containing 600000 examples of each of 13.8 million examples, with SNRs...
  • Subjectivity Dataset

    The Subjectivity dataset is a dataset provided by [Pang and Lee, 2004].
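
The generative recipes named in the NOISE, NEG, CTM and DIR entries above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the original authors' construction: all function names are hypothetical, the noise ζ is assumed Gaussian with a made-up scale, the logistic normal prior is assumed to have zero mean and identity covariance, and the real-data component that makes these datasets "semi-synthetic" is not reproduced because the listing does not describe it.

```python
import numpy as np


def sample_neg_matrix(n_rows, n_cols, rng):
    """NEG-style matrix: entries i.i.d. uniform on [-0.5, 0.5)."""
    return rng.uniform(-0.5, 0.5, size=(n_rows, n_cols))


def sample_dir_columns(n_topics, n_docs, rng, alpha=0.05):
    """DIR-style matrix: each column drawn from Dirichlet(alpha, ..., alpha)."""
    return rng.dirichlet(np.full(n_topics, alpha), size=n_docs).T


def sample_ctm_columns(n_topics, n_docs, rng, mean=None, cov=None):
    """CTM-style matrix: each column drawn from a logistic normal prior.

    Zero mean and identity covariance are assumptions for illustration.
    """
    mean = np.zeros(n_topics) if mean is None else mean
    cov = np.eye(n_topics) if cov is None else cov
    eta = rng.multivariate_normal(mean, cov, size=n_docs)
    eta = eta - eta.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(eta)
    return (weights / weights.sum(axis=1, keepdims=True)).T


def add_noise(a_star, x, rng, sigma=0.1):
    """NOISE-style observation y = A* x + zeta (Gaussian zeta is an assumption)."""
    y = a_star @ x
    return y + sigma * rng.standard_normal(y.shape)
```

A typical call would create a generator with `rng = np.random.default_rng(0)`, draw `a_star = sample_neg_matrix(100, 10, rng)` and `x = sample_dir_columns(10, 50, rng)`, and then form `y = add_noise(a_star, x, rng)`. The matrix sizes here are arbitrary.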