Explanations for Attributing Deep Neural Network Predictions

Ruth Fong, Andrea Vedaldi

Research output: Chapter in Book/Report/Conference proceeding › Chapter

42 Scopus citations

Abstract

Given the recent success of deep neural networks and their application to high-impact, high-risk domains, such as autonomous driving and healthcare decision-making, there is a great need for faithful and interpretable explanations of “why” an algorithm is making a certain prediction. In this chapter, we introduce (1) Meta-Predictors as Explanations, a principled framework for learning explanations for any black box algorithm, and (2) Meaningful Perturbations, an instantiation of our paradigm applied to the problem of attribution, which is concerned with attributing what features of an input (e.g., regions of an input image) are responsible for a model’s output (e.g., a CNN classifier’s object class prediction). We first introduced these contributions in [8]. We also briefly survey existing visual attribution methods and highlight how they fail to be both faithful and interpretable.
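
As a rough illustration of the perturbation-based attribution described above, the sketch below optimizes a blur mask over an input image so that the smallest, smoothest deleted region suppresses the classifier's prediction. It assumes a PyTorch image classifier; the names (model, image, target_class) and all hyperparameter values are hypothetical placeholders, not the chapter's reference implementation.

    import torch
    import torch.nn.functional as F

    def perturbation_mask(model, image, target_class, steps=300, lr=0.1,
                          l1_coeff=1e-2, tv_coeff=1e-1):
        # Assumed inputs: `model` is a PyTorch classifier returning class logits,
        # `image` is a 1xCxHxW tensor, `target_class` is the class index to explain.
        # Hyperparameters are illustrative, not tuned values from the chapter.
        blurred = F.avg_pool2d(image, kernel_size=11, stride=1, padding=5)  # perturbed reference

        # Low-resolution mask (upsampled later); values near 1 keep the original pixels,
        # values near 0 replace them with the blurred reference.
        mask_param = torch.zeros(1, 1, 28, 28, requires_grad=True)
        optimizer = torch.optim.Adam([mask_param], lr=lr)

        for _ in range(steps):
            m = torch.sigmoid(mask_param)
            m_up = F.interpolate(m, size=image.shape[-2:], mode="bilinear", align_corners=False)
            perturbed = m_up * image + (1 - m_up) * blurred
            prob = F.softmax(model(perturbed), dim=1)[0, target_class]

            # Penalize the amount of deleted (perturbed) area and encourage a smooth mask.
            l1_term = l1_coeff * (1 - m_up).abs().mean()
            tv_term = tv_coeff * ((m_up[..., :, 1:] - m_up[..., :, :-1]).abs().mean()
                                  + (m_up[..., 1:, :] - m_up[..., :-1, :]).abs().mean())

            # Drive the target-class probability down while deleting as little as possible.
            loss = prob + l1_term + tv_term
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Regions that had to be deleted (low mask values) are attributed as evidence.
        return torch.sigmoid(mask_param).detach()

A saliency map can then be read off as 1 minus the returned mask: the areas the optimizer had to blur in order to suppress the prediction.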

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 149-167
Number of pages: 19
DOIs
State: Published - 2019
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11700 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • Computer vision
  • Explainable artificial intelligence
  • Machine learning
