-
Chinese Corpus
The dataset is used for corpus analysis that is fully language-independent and unsupervised, requiring no prior linguistic knowledge.
-
Accountant Corpus
The dataset is used for corpus analysis that is fully language-independent and unsupervised, requiring no prior linguistic knowledge.
-
Medline Corpus
The dataset is used for corpus analysis that is fully language-independent and unsupervised, requiring no prior linguistic knowledge.
-
Wittgenstein Corpus
The dataset is used for corpus analysis that is fully language-independent and unsupervised, requiring no prior linguistic knowledge.
-
EU-Parliament Corpus
The dataset is used for corpus analysis that is fully language-independent and unsupervised, requiring no prior linguistic knowledge.
-
Wikipedia Corpus
The dataset used in the paper is a subset of the Wikipedia corpus consisting of 7500 English Wikipedia articles, each belonging to one of the following categories: People, Cities,...