PubMed, ArXiv, and Movies datasets

doi:10.57702/lks2wyqm

The paper uses three datasets: PubMed, ArXiv, and Movies. PubMed is a medical dataset consisting of research articles from the PubMed repository. The articles' subheadings denote the source and target domains, namely female and male patients, and the labels represent different biological categories. ArXiv is a collection of research paper abstracts, where the labels are the subjects into which each abstract is categorized. The target and source domains are old and new articles scraped from the ArXiv repository. The Movies dataset is a collection of movie summaries that may belong to different genres. The source and target domains for the movie overviews are Wikipedia and IMDb, respectively.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions based on DCAT.

Cite this as

Miruna Betianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen (2024). Dataset: PubMed, ArXiv, and Movies datasets. https://doi.org/10.57702/lks2wyqm

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	Miruna Betianu
More Authors	Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen
Homepage	https://arxiv.org