PubMed, ArXiv, and Movies datasets

The paper uses three datasets: PubMed, ArXiv, and Movies. PubMed is a medical dataset of research articles from the PubMed repository. The articles' subheadings denote the source and target domains, namely female and male patients, and the labels represent different biological categories. ArXiv is a collection of research paper abstracts whose labels are the subjects under which each abstract is categorized. The target and source domains are old and new articles scraped from the ArXiv repository. The Movies dataset contains a collection of movie summaries that may belong to different genres. The source and target domains for the movie overviews are Wikipedia and IMDb, respectively.

Data and Resources

This dataset has no data

Cite this as

Miruna Betianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen (2024). Dataset: PubMed, ArXiv, and Movies datasets. https://doi.org/10.57702/lks2wyqm

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Created: December 2, 2024
Last update: December 2, 2024
Author: Miruna Betianu
More Authors: Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen
Homepage: https://arxiv.org