-
UniHPF: Universal Healthcare Predictive Framework
The UniHPF dataset is a collection of Electronic Health Records (EHRs) from various hospitals, including MIMIC-III, MIMIC-IV, and eICU. -
Yelp-Health
The Yelp-Health dataset is used in the paper to demonstrate the leakage of dataset properties in multi-party machine learning. -
Healthcare
The paper does not explicitly describe the dataset, but it mentions that the authors worked with five real-world domains: child welfare, education, cybersecurity,... -
ICD-9 dataset
The ICD-9 dataset contains diagnosis codes (ICD-9) from medical record data. -
TuDiabetes Forum
The TuDiabetes Forum dataset was collected from the TuDiabetes forum, a popular diabetes community operated by the Diabetes Hands Foundation. -
BGnow, TuDiabetes Forum
The BGnow dataset is derived from diabetic users who actively share their wellness data on Twitter. TuDiabetes Forum: We also collected a dataset from the TuDiabetes forum, a... -
Diabetes Support Group, BGnow, TuDiabetes Forum
The Diabetes Support Group dataset is collected from posts by users who follow and participate in diabetes support groups such as “diabeteslife” or “diabetesconnect” on Twitter. BGnow... -
COVID-19 Vaccination Search Insights
COVID-19 Vaccination Search Insights dataset is a collection of anonymized search queries and their corresponding labels, which indicate whether the query is related to COVID-19... -
CoAID: COVID-19 Healthcare Misinformation Dataset
The CoAID dataset includes 4,251 health-related fake news items posted on websites and social platforms. -
Heart Failure and COVID-19 Datasets
The dataset is used for outcome-aware stratification of patients with heart failure and COVID-19. The dataset contains features such as demographics, medical events, and... -
Patient Treatment Classification dataset
The Patient Treatment Classification dataset comprises Electronic Health Records collected from a private hospital in Indonesia. -
Polaris: A Safety-focused LLM Constellation for Healthcare
The Polaris dataset is a collection of conversations between a patient and a healthcare agent, with the goal of developing a safety-focused Large Language Model (LLM)... -
Lab Measurements and Diagnosis Information
The dataset used in this paper is a large-scale collection of lab measurements and diagnosis information for 298,000 individuals. -
Partner for Kids (PFK) claims data
The claims data, containing medical codes, services information, and incurred expenditure, can be a good resource for estimating an individual’s health condition and medical... -
Hepatitis Patients
The Hepatitis Patients dataset consists of 20 features and 155 observations. -
Liver Patients
The Liver Patients dataset consists of 583 observations and 11 features. -
Breast Cancer Wisconsin (Original)
The Breast Cancer Wisconsin (Original) dataset consists of 699 observations and 11 features. -
PhysioNet Challenge 2021
The dataset used in this paper is a collection of electrocardiogram (ECG) signals from healthy individuals, with a focus on accurately capturing the subtleties of ECG morphology.