-
LOCO dataset
The LOCO dataset consists of a large number of documents collected from 58 conspiracy-theory media sources and 92 mainstream media sources. -
Media Frames Corpus
A dataset of annotated news articles and social media posts for frame classification. -
BFRS Dataset
The BFRS dataset contains news stories from Pakistan with labels for various categories related to political violence. -
Crowd Counting Consortium
The Crowd Counting Consortium dataset contains news stories from Pakistan with labels for various categories. -
Berita Dataset
The Berita dataset consists of 50,304 digital Indonesian news articles shared online through Twitter. -
Reuters Dataset
The Reuters dataset is a text classification dataset containing 21,578 samples. -
NYT Dataset
The NYT dataset is a collection of articles published between 2012 and 2022. -
Patrika Dataset
The Patrika dataset is used as an independent test set. -
Nayadiganta Dataset
The Nayadiganta dataset is used as an independent test set. -
Hindinews and Livehindustan Articles
Hindinews, Livehindustan, and Patrika newspaper articles, openly available on Kaggle and covering similar domains. -
Bengali and Hindi News Articles
The Bengali dataset consists of articles from online public news portals such as Prothom-Alo, BDNews24, and Nayadiganta. The articles encompass domains such as politics,... -
Disin dataset
The Disin dataset is a fake news dataset hosted on Kaggle, comprising 12,600 fake news articles and 12,600 truthful news articles. -
News Articles Dataset
The dataset used in this paper is a collection of news articles from an international news website, covering a time span from September 2012 to April 2014.