-
Accountant Corpus
The dataset is used to analyze corpora in a completely language-independent and unsupervised way, without any prior linguistic knowledge.
-
VIDEO and NEWS datasets
The paper uses two real-world datasets: VIDEO and NEWS.
-
Reddit News Topical Interactions
The dataset used in this study was gathered from the Pushshift Reddit repository, which archives all Reddit posts and comments up to June 2021.
-
News Articles Dataset
The dataset used in this paper is a collection of news articles from an international news website, covering a time span from September 2012 to April 2014.