Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier's Decision

20 Nov 2020 · Kwanseok Oh, Jee Seok Yoon, Heung-Il Suk

There is an apparent negative correlation between the performance and the interpretability of deep learning models. To mitigate this trade-off, we propose the Born Identity Network (BIN), a post-hoc approach for producing multi-way counterfactual maps. A counterfactual map transforms an input sample so that it is classified as a given target label, similar to how humans process knowledge through counterfactual thinking. For example, a counterfactual map can localize the hypothetical abnormalities that would cause a normal brain image to be diagnosed with a disease. BIN consists of two core components: a Counterfactual Map Generator and a Target Attribution Network. The Counterfactual Map Generator is a variant of a conditional GAN that can synthesize a counterfactual map conditioned on an arbitrary target label. The Target Attribution Network guides the generation of these maps by conditioning the Counterfactual Map Generator on the target label. We validated BIN through qualitative and quantitative analyses on the MNIST, 3D Shapes, and ADNI datasets, and demonstrated the comprehensibility and fidelity of our method through various ablation studies.
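
To make the notion of a counterfactual map concrete, the following is a minimal toy sketch, not the BIN architecture. It assumes that the map is added to the input (the abstract only says the map "transforms" the input) and it replaces the learned generator with a closed-form perturbation for a hypothetical two-class linear classifier. All names (`score`, `classify`, `counterfactual_map`, `apply_map`) and all weights are illustrative assumptions.

```python
# Toy illustration only: a hypothetical linear classifier stands in for the
# deep model, and a closed-form perturbation stands in for the learned
# Counterfactual Map Generator. The additive form is an assumption.

def score(x, w, b):
    """Linear decision score; positive means class 1, otherwise class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    return 1 if score(x, w, b) > 0 else 0

def counterfactual_map(x, target, w, b, margin=1.0):
    """Smallest (L2) additive change that moves x to the target class.

    Returns an all-zero map when x is already classified as the target,
    so unchanged inputs yield an empty explanation.
    """
    if classify(x, w, b) == target:
        return [0.0] * len(x)
    desired = margin if target == 1 else -margin
    scale = (desired - score(x, w, b)) / sum(wi * wi for wi in w)
    return [scale * wi for wi in w]

def apply_map(x, cmap):
    """Assumed additive transformation: counterfactual = input + map."""
    return [xi + mi for xi, mi in zip(x, cmap)]

# Hypothetical weights and input, chosen only for the demonstration.
w, b = [1.0, -2.0, 0.5], 0.1
x = [0.2, 0.9, 0.4]

cmap = counterfactual_map(x, target=1, w=w, b=b)
print(classify(x, w, b), classify(apply_map(x, cmap), w, b))
```

The entries of `cmap` with the largest magnitude indicate which input features would have to change for the decision to flip, which is the sense in which a counterfactual map explains a classifier's decision. In BIN this map is produced by a conditional GAN rather than derived analytically.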
