medXGAN: Visual Explanations for Medical Classifiers through a Generative Latent Space

Amil Dravid

Florian Schiffers

Boqing Gong

Aggelos K. Katsaggelos

[Paper]

[GitHub]

[Demos]

Overview. We propose medXGAN, a GAN framework that explicitly disentangles anatomical structure from classifier-specific pathology. After training with feedback from a fixed classifier, the generator can be used to explain the classifier's decision in fine-grained detail. Given a ground-truth positive image, its latent code can be found via an optimization scheme, and the positive image can then be turned into a negative realization. Additionally, we propose using a negative realization as the baseline for integrated gradients, interpolating in the latent space rather than in pixel space. This method, Latent Integrated Gradients (LIG), is robust to noise and edges.
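To make the "optimization scheme" for recovering a latent code more concrete, here is a minimal sketch of the general idea: gradient descent on a reconstruction error with the generator held fixed. Everything in it is an assumption for illustration. The `generate` function is a toy tanh-of-linear map standing in for a trained generator, `invert` and `reconstruction_error` are hypothetical helper names, and the plain mean squared error may differ from the objective actually used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained generator: 4-d latent -> 64 "pixels".
W = 0.5 * rng.normal(size=(64, 4))

def generate(z):
    return np.tanh(W @ z)

def reconstruction_error(z, target):
    return float(np.mean((generate(z) - target) ** 2))

def invert(target, steps=2000, lr=0.1):
    """Search for a latent code whose generated image matches `target`."""
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        g = generate(z)
        # Gradient of mean((g - target)^2) w.r.t. z, using
        # d tanh(u)/du = 1 - tanh(u)^2 and the chain rule through W.
        grad = W.T @ ((1.0 - g**2) * 2.0 * (g - target)) / target.size
        z -= lr * grad
    return z

# A synthetic "observed positive image" produced from a known latent code.
z_true = 0.5 * rng.normal(size=4)
target = generate(z_true)

z_hat = invert(target)
initial_error = reconstruction_error(np.zeros(4), target)
final_error = reconstruction_error(z_hat, target)
print(f"error before: {initial_error:.4f}, after: {final_error:.6f}")
```

With a real generator the gradient would come from automatic differentiation rather than the hand-written expression above; the toy model is used only so the sketch runs with NumPy alone.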

[Slides]

Abstract

Despite the surge of deep learning in the past decade, some users remain hesitant to deploy these models in practice because of their black-box nature. In the medical domain in particular, where errors can have severe repercussions, we need methods for gaining confidence in the models' decisions. To this end, we propose a novel medical imaging generative adversarial framework, medXGAN (medical eXplanation GAN), to visually explain what a medical classifier focuses on in its binary predictions. By encoding domain knowledge of medical images, we are able to disentangle anatomical structure and pathology, leading to fine-grained visualization through latent interpolation. Furthermore, we optimize the latent space such that interpolation explains how the features contribute to the classifier's output. Our method outperforms baselines such as Gradient-Weighted Class Activation Mapping (Grad-CAM) and Integrated Gradients in localization and explanatory ability. Additionally, combining medXGAN with Integrated Gradients can yield explanations that are more robust to noise.

Example Visualizations

Negative to Positive Latent Interpolation

Either a simple pixel-wise difference between the positive and negative realizations or latent integrated gradients (LIG) provides a more detailed explanation. Additional visualizations, quantitative results, and further experiments are available in the paper.
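The two explanation maps mentioned above can be illustrated with a small sketch. It is not the authors' implementation: `generate` is a toy generator with a single latent vector (the paper separates anatomy and pathology codes), `classify` is a hypothetical logistic classifier with a hand-written gradient, and `latent_integrated_gradients` is a generic path-integral approximation that accumulates classifier gradients along the images produced by interpolating between a negative and a positive latent code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a 4-d latent generator and a logistic "classifier"
# over 16 pixels. Neither is the trained medXGAN model.
W = rng.normal(size=(16, 4))
w = 0.3 * rng.normal(size=16)

def generate(z):
    return np.tanh(W @ z)

def classify(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def classifier_grad(x):
    p = classify(x)
    return p * (1.0 - p) * w  # gradient of the sigmoid output w.r.t. pixels

def latent_integrated_gradients(z_neg, z_pos, steps=256):
    """Accumulate gradient * pixel change along a latent-space interpolation."""
    alphas = np.linspace(0.0, 1.0, steps + 1)
    images = [generate(z_neg + a * (z_pos - z_neg)) for a in alphas]
    attribution = np.zeros_like(images[0])
    for x_prev, x_next in zip(images[:-1], images[1:]):
        # Midpoint rule on the short segment between consecutive generated images.
        x_mid = 0.5 * (x_prev + x_next)
        attribution += classifier_grad(x_mid) * (x_next - x_prev)
    return attribution

z_neg, z_pos = rng.normal(size=4), rng.normal(size=4)
x_neg, x_pos = generate(z_neg), generate(z_pos)

difference_map = np.abs(x_pos - x_neg)               # pixel-wise difference
lig_map = latent_integrated_gradients(z_neg, z_pos)  # per-pixel attribution
score_change = classify(x_pos) - classify(x_neg)
print(lig_map.sum(), score_change)
```

A useful sanity check for this kind of path attribution is that the per-pixel values sum approximately to the change in classifier output between the two endpoints, which is why the last line prints both quantities side by side.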

[Bibtex]
