PubMed, ArXiv, and Movies datasets

The datasets used in the paper are PubMed, ArXiv, and Movies. PubMed is a medical dataset consisting of research articles from the PubMed repository. The articles' subheadings denote the source and target domains, namely female and male patients. The labels represent different biological categories. ArXiv provides a collection of research paper abstracts, where the labels represent the subject categories of each abstract. The target and source domains are old and new articles scraped from the ArXiv repository. The Movies dataset contains a collection of movie summaries, each of which may belong to several genres. The source and target domains for the movie overviews are Wikipedia and IMDb, respectively.
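As a rough illustration of the structure described above, the three datasets can be summarized as small records that pair each dataset with its label meaning and its two domains. This is only a sketch: the names `DatasetCard`, `CARDS`, and `get_card` are hypothetical and are not part of the published dataset or any accompanying code. The source and target fields are filled in only for Movies, where the description states the direction explicitly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DatasetCard:
    """Hypothetical summary record; not part of the published dataset."""
    name: str
    content: str
    label_meaning: str
    domains: Tuple[str, str]       # the two domains named in the description
    source: Optional[str] = None   # set only where the direction is stated explicitly
    target: Optional[str] = None


CARDS = [
    DatasetCard(
        name="PubMed",
        content="research articles",
        label_meaning="biological categories",
        domains=("female patients", "male patients"),
    ),
    DatasetCard(
        name="ArXiv",
        content="research paper abstracts",
        label_meaning="subjects",
        domains=("old articles", "new articles"),
    ),
    DatasetCard(
        name="Movies",
        content="movie summaries",
        label_meaning="genres",
        domains=("Wikipedia", "IMDb"),
        source="Wikipedia",
        target="IMDb",
    ),
]


def get_card(name: str) -> DatasetCard:
    """Look up a card by dataset name (case-insensitive)."""
    for card in CARDS:
        if card.name.lower() == name.lower():
            return card
    raise KeyError(name)


if __name__ == "__main__":
    for card in CARDS:
        print(f"{card.name}: {card.content}, labels = {card.label_meaning}, "
              f"domains = {card.domains[0]} / {card.domains[1]}")
```

A record like this keeps the domain pair separate from the label meaning, which is the distinction the description draws for each dataset.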

Cite this as

Miruna Betianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen (2024). Dataset: PubMed, ArXiv, and Movies datasets. https://doi.org/10.57702/lks2wyqm

DOI retrieved: December 2, 2024

Additional Info

Created: December 2, 2024
Last update: December 2, 2024
Author: Miruna Betianu
More Authors: Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen
Homepage: https://arxiv.org