ConceptDistil – 2-staged
ConceptDistil is a method to bring concept explanations to any black-box classifier using knowledge distillation.
ConceptDistil – No gradient
ConceptDistil is a method to bring concept explanations to any black-box classifier using knowledge distillation.
Teaching the machine to explain itself using domain knowledge
ConceptDistil is a method to bring concept explanations to any black-box classifier using knowledge distillation.
ConceptDistil: Model-Agnostic Distillation of Concept Explanations
ConceptDistil is a method to bring concept explanations to any black-box classifier using knowledge distillation.
Do Models Explain Themselves?
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
E-WNC: Explainable Subjective Bias Style Transfer
We build two explainable style transfer datasets by augmenting existing datasets with synthetic textual explanations generated by a teacher model.
E-GYAFC: Explainable Formality Style Transfer
We build two explainable style transfer datasets by augmenting existing datasets with synthetic textual explanations generated by a teacher model.
ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer
We build two explainable style transfer datasets by augmenting existing datasets with synthetic textual explanations generated by a teacher model.
Challenge 3 - Blue and Yellow Circles
Challenge 3 - Blue and Yellow Circles: The set of all possible Kandinsky Figures consists of equal-sized blue and yellow circles.
Challenge 2 - Nine Circles
Challenge 2 - Nine Circles: The set of Kandinsky Figures consists of nine circles arranged in a regular grid.
Challenge 1 - Objects and Shapes
Challenge 1 - Objects and Shapes: The ground truth gt(k) is defined as 'in a Kandinsky Figure, small objects are arranged on big shapes, same as object shapes, in the big shape of...
Kandinsky Patterns
Kandinsky Patterns are mathematically describable, simple, self-contained, and hence controllable test datasets for the development, validation, and training of explainability in...
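Because Kandinsky Figures are fully describable by object attributes, they can be sampled and checked programmatically. The sketch below is a minimal, hypothetical illustration: the shape/color vocabulary and the Challenge 3 ground-truth predicate are assumptions modeled on the blurbs above, not the challenges' exact definitions.

```python
import random

# Attribute vocabulary assumed for illustration; the actual
# Kandinsky Patterns challenges may use a different set.
SHAPES = ["circle", "square", "triangle"]
COLORS = ["red", "yellow", "blue"]

def random_figure(n_objects=4, seed=None):
    """Sample a Kandinsky Figure as a list of attribute dicts."""
    rng = random.Random(seed)
    return [
        {
            "shape": rng.choice(SHAPES),
            "color": rng.choice(COLORS),
            "size": rng.uniform(0.05, 0.25),  # relative to canvas side
            "x": rng.uniform(0.0, 1.0),
            "y": rng.uniform(0.0, 1.0),
        }
        for _ in range(n_objects)
    ]

def gt_challenge3(figure):
    """Hypothetical ground truth for Challenge 3: every object is a
    blue or yellow circle, and all objects have equal size."""
    sizes = [o["size"] for o in figure]
    return (
        all(o["shape"] == "circle" for o in figure)
        and all(o["color"] in ("blue", "yellow") for o in figure)
        and max(sizes) - min(sizes) < 1e-9
    )
```

Because the ground truth is a pure function of the attribute list, arbitrarily many labeled figures can be generated for training or validating an explainer.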
Visual Classification as Linear Combination of Words
Explainability is a longstanding challenge in deep learning, especially in high-stakes domains like healthcare. Common explainability methods highlight image regions that drive...
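The title's core idea, scoring a class as a linear combination of per-word similarities, can be sketched as follows. This is a hedged illustration only: the embeddings are random stand-ins, the word list is invented, and the paper's actual model and training setup are not reproduced.

```python
import numpy as np

# Stand-in embeddings; in practice these would come from a
# vision-language model. Words are illustrative, not the paper's.
rng = np.random.default_rng(0)
dim = 16
words = ["opacity", "nodule", "effusion"]
word_emb = rng.normal(size=(len(words), dim))
word_emb /= np.linalg.norm(word_emb, axis=1, keepdims=True)

def class_score(image_emb, weights):
    """Class score = sum_i weights[i] * cos_sim(image, word_i)."""
    img = image_emb / np.linalg.norm(image_emb)
    sims = word_emb @ img          # cosine similarity per word
    return float(weights @ sims)   # linear combination
```

The interpretability payoff is that each weight directly states how much a human-readable word contributes to the prediction, rather than pointing at pixels.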
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
The HateXplain dataset is a benchmark dataset for explainable hate speech detection.
ExplainFix: Explainable Spatially Fixed Deep Networks
ExplainFix adopts two design principles: the “fixed filters” principle that all spatial filter weights of convolutional neural networks can be fixed at initialization and never... -
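The "fixed filters" principle can be illustrated in a few lines: spatial kernels are frozen at initialization and only the mixing weights over their responses are trained. This is a minimal numpy sketch under assumed shapes, not the paper's architecture or kernel choice.

```python
import numpy as np

# Spatial 3x3 kernels are frozen at initialization and never updated;
# the per-filter scalar mixing weights are the only trainable parameters.
rng = np.random.default_rng(0)
fixed_kernels = rng.normal(size=(4, 3, 3))  # frozen spatial filters
mixing = np.zeros(4)                        # trainable mixing weights

def forward(x):
    """Correlate x (H, W) with each fixed kernel, then combine the
    response maps with the learned scalar weights."""
    H, W = x.shape
    out = np.zeros((4, H - 2, W - 2))
    for k in range(4):
        for i in range(H - 2):
            for j in range(W - 2):
                out[k, i, j] = np.sum(x[i:i+3, j:j+3] * fixed_kernels[k])
    return np.tensordot(mixing, out, axes=1)  # shape (H-2, W-2)
```

Freezing the spatial filters shrinks the trainable parameter count drastically, which is one motivation the entry hints at for the design principle.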
Manifold Hypothesis for Gradient-Based Explanations
The datasets used in the paper come from several sources: MNIST, EMNIST, CIFAR10, an X-ray pneumonia dataset, and a Diabetic Retinopathy detection dataset.