VISE: Validated and Invalidated Symbolic Explanations for Knowledge Graph Integrity

VISE is a novel hybrid approach that combines symbolic learning, constraint validation, and numerical learning. It uses knowledge graph embeddings (KGE) to capture implicit information and represent negation in knowledge graphs (KGs), thereby improving the prediction performance of numerical models. The experimental results show that this hybrid technique effectively combines the strengths of the symbolic, numerical, and constraint validation paradigms.

This collection includes all the data necessary to reproduce the results from the experimental evaluation of VISE at EXPLIMED @ ECAI'24. The data is an anonymized synthetic lung cancer benchmark that comprises clinical data extracted from heterogeneous sources such as publications, clinical trials, and clinical records representing patients diagnosed with lung cancer. We evaluate the VISE approach on three anonymized Lung Cancer KGs: LC-KG1, LC-KG2, and LC-KG3.

The collection comprises nine data sets of three different sizes:

  • LC Knowledge Graph 1 (LC-KG1) models 29 lung cancer patients
  • LC Knowledge Graph 2 (LC-KG2) models 203 lung cancer patients
  • LC Knowledge Graph 3 (LC-KG3) models 319 lung cancer patients

Each of the three KGs is available in three variants, each with its own characteristics:

  • "Original KG": The original KG comprises anonymized lung cancer patients with different medical characteristics.
  • "Enriched KG": The original KG completed through inductive learning, that is, KG completion via self-supervised symbolic learning over the original KG.
  • "Transformed KG": A transformation of the enriched KG based on the results of evaluating SHACL shapes over it. This step determines the validity of the data.
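
How the three variants relate can be illustrated with a minimal, self-contained sketch. It is a toy under stated assumptions, not the VISE implementation: the entities, the predicates (hasStage, hasTreatment, validity), the single hand-written rule standing in for mined symbolic rules, and the simple "at least one treatment" check standing in for a SHACL shape are all hypothetical and do not come from the Lung Cancer KGs.

```python
from typing import Set, Tuple

# Toy illustration only: all names below are hypothetical.
Triple = Tuple[str, str, str]

# "Original KG": the facts as given.
original_kg: Set[Triple] = {
    ("patient1", "hasStage", "IV"),
    ("patient1", "hasTreatment", "chemotherapy"),
    ("patient2", "hasStage", "IV"),
    ("patient3", "hasStage", "I"),
}


def enrich(kg: Set[Triple]) -> Set[Triple]:
    """Complete the KG with one Horn rule (stand-in for mined rules):
    hasStage(p, IV) => hasTreatment(p, chemotherapy)."""
    inferred = {
        (s, "hasTreatment", "chemotherapy")
        for (s, p, o) in kg
        if p == "hasStage" and o == "IV"
    }
    return kg | inferred


def transform(kg: Set[Triple]) -> Set[Triple]:
    """Tag each entity as valid or invalid against one constraint
    (stand-in for a SHACL shape): an entity needs >= 1 hasTreatment."""
    subjects = {s for (s, _, _) in kg}
    tags: Set[Triple] = set()
    for s in subjects:
        has_treatment = any(
            s2 == s and p == "hasTreatment" for (s2, p, _) in kg
        )
        tags.add((s, "validity", "valid" if has_treatment else "invalid"))
    return kg | tags


# "Enriched KG" and "Transformed KG" derived from the original.
enriched_kg = enrich(original_kg)
transformed_kg = transform(enriched_kg)

for triple in sorted(transformed_kg):
    print(triple)
```

In this toy, enrichment adds the missing treatment for patient2, and the transformation then records which entities satisfy the constraint. The actual data sets are produced with rule mining and real SHACL validation, so the resulting KGs are structured differently from this sketch.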

VISE is also evaluated on KGs comprising 1242 lung cancer patients (LungCancer-OriginalKG, LungCancer-EnrichedKG, and LungCancer-TransformedKG).

BibTex: