- ACL Anthology
The ACL Anthology dataset contains papers on natural language processing, including citation patterns, authorship, and language use over time.
- Source Code Authorship Identification Dataset
The dataset consists of raw Java source code files taken from the GitHub repositories of various authors.