Individual Text Corpora Predict Openness, Interests, Knowledge and Level of E...
Individual text corpora (ICs) were generated from 214 participants, with a mean of 5 million word tokens per corpus.
Wikipedia Corpus
The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles belonging to one of the following categories: People, Cities,...