Dataset - LDM

Chinese Corpus

The dataset is used to analyze corpora in a completely language-independent and unsupervised way without any prior linguistic knowledge.
- Dataset
- JSON
Accountant Corpus

The dataset is used to analyze corpora in a completely language-independent and unsupervised way without any prior linguistic knowledge.
- Dataset
- JSON
Medline Corpus

The dataset is used to analyze corpora in a completely language-independent and unsupervised way without any prior linguistic knowledge.
- Dataset
- JSON
Wittgenstein Corpus

The dataset is used to analyze corpora in a completely language-independent and unsupervised way without any prior linguistic knowledge.
- Dataset
- JSON
EU-Parliament Corpus

The dataset is used to analyze corpora in a completely language-independent and unsupervised way without any prior linguistic knowledge.
- Dataset
- JSON
Wikipedia Corpus

The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles, each belonging to one of the following categories: People, Cities,...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

6 datasets found