6 datasets found

Groups: Language; Tags: Language

  • Chinese Corpus

    The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
  • Accountant Corpus

    The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
  • Medline Corpus

    The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
  • Wittgenstein Corpus

    The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
  • EU-Parliament Corpus

    The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
  • Wikipedia Corpus

    The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7,500 English Wikipedia articles, each belonging to one of the following categories: People, Cities,...
You can also access this registry using the API (see API Docs).