-
Dual-sparse Regularized Randomized Reduction
The paper proposes dual-sparse regularized randomized reduction methods for classification and uses the RCV1-binary dataset. -
Liar, Liar Pants on Fire: A New Benchmark Dataset for Fake News Detection
A new benchmark dataset for fake news detection, containing 12,836 short statements labeled for truthfulness, subject, context/venue, speaker, state, party, and prior history. -
news20-binary
The dataset used in the paper is the news20-binary dataset. -
E2006-log1p
The dataset used in the paper is the E2006-log1p dataset. -
Amazon dataset
The Amazon dataset is used to evaluate the performance of the proposed approach. It consists of 2000 users, 1500 items, 86690 reviews, 7219 number ratings, 3.6113 average number... -
Guilt Detection in Text: A Step Towards Understanding Complex Emotions
A dataset for guilt detection in text, laying the groundwork for future research and advancements in understanding guilt through NLP methods. -
Regret Detection and Domain Identification Dataset (ReDDIT)
A novel dataset tailored to dissect the relationship between guilt and regret and their unique textual markers. -
SST-1, SST-2, SUBJ, IMDB
Datasets used for text classification tasks: SST-1, SST-2, SUBJ, and IMDB. -
Sent140 dataset
Sent140 is a real-world dataset for sentiment analysis. -
Online news popularity data
The dataset contains features of articles published by the Mashable website over a period of two years. -
Twitter Dataset
The Twitter Dataset is a collection of tweets in three languages (English, Dutch, and German) annotated with Plutchik's emotions. -
Hatespeech
The Hatespeech dataset is a collection of tweets containing terms from hate speech lexicons. -
C4 dataset
The paper does not explicitly name a dataset, but it states that the authors trained a GPT-2 transformer language model on the C4 dataset.