Robust Attribution Regularization
Recent work on training neural networks to have robust attributions, in order to improve their trustworthiness and resilience to adversarial attacks.…
In recent work with my colleagues at CMU, we focus on bringing greater transparency to these mysterious, yet effective, machine learning techniques.…
This post will delve deeper into Path Integrated Gradient Methods for Feature Attribution in Neural Networks.…
In this third post of the series on Explainability in Neural Networks, we present Axioms of Attribution, a set of desirable properties that any reasonable feature-attribution method should have.…
We examine some simple, intuitive methods to explain the output of a neural network (based on perturbations and gradients), and see how they produce nonsensical results for non-linear functions.…
The wild success of Deep Neural Network (DNN) models in a variety of domains has created considerable excitement in the machine learning community. Despite this success, a deep understanding of why DNNs perform so well, and whether their performance is somehow brittle, has been lacking.…