Building Trust in AI for Healthcare: Interpretability Is Not Enough

In my previous post I discussed bias in applications of machine learning and AI for healthcare, and the new methods being developed to detect and mitigate it. Preventing bias is essentially about ensuring that factors like race and gender do not inappropriately drive decisions. A closely related question is which factors do drive decisions, and how they do so. This question is known as “interpretability”.

Interpretability is a growing field in machine learning, partly because deep learning methods, which have revolutionized image and text processing, are generally considered to be highly uninterpretable. Making them more interpretable is an important research goal.

Interpretability is regarded as especially critical in healthcare applications. Its importance is justified by the refrain “how can a clinician be expected to act on a model’s recommendation without understanding why the recommendation was made in the first place?” This is a sensible concern given known cases of machine learning models making wrong decisions in high-stakes domains like criminal justice and environmental protection.

Clinicians (and other humans) follow recommendations when they trust them. Understanding the reasoning behind a recommendation is just one way, albeit a powerful one, of building trust. For this reason, interpretability of machine learning models is often conflated with trust in them.

But in fact, interpretability isn’t always required for trust. All of us trust many technological methods and devices despite not fully understanding how they work, revealing our trust by using them daily. For example, you may not understand exactly how email works, but you trust that your emails will make it to the right inbox. This is an example of trust bred by familiarity rather than understanding.

To complicate things further, some modeling methods, like decision trees, are considered highly interpretable even though they can be very difficult for a human to make sense of: try to understand the image below (taken from this paper, found with a casual search for “clinical decision tree”).

[Figure: a clinical decision tree]
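
To make the point concrete, here is a minimal sketch (not from the original figure) of how quickly a nominally interpretable decision tree can outgrow human comprehension. It assumes scikit-learn and a synthetic dataset standing in for real clinical data; the exact numbers will vary, but an unpruned tree routinely produces hundreds of branching rules.

```python
# Minimal sketch: a fully "interpretable" decision tree can still be unreadable.
# Assumes scikit-learn; the synthetic data stands in for real clinical features.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)  # no depth limit

rules = export_text(tree)  # the tree rendered as nested if/else rules
print("Leaves:", tree.get_n_leaves())
print("Lines of rules a reader must follow:", len(rules.splitlines()))
```

Every split in that printout is transparent on its own, yet the rule set as a whole is hardly something a clinician can absorb at the point of care.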

So if interpretability on its own is neither sufficient nor strictly necessary for building trust in machine learning models, what is? Trust is a complex feeling that is influenced by many factors. There is a large body of literature identifying key elements of trust, both in other humans in a professional setting and in technology. One of these factors is transparency, which for machine learning models is a direct consequence of interpretability. Another factor is competence, which is related to performance metrics such as a model’s accuracy.

But other factors are just as important in building trust, among them respectfulness and honesty. It may be difficult to think of a machine learning application as being respectful or honest. But the interaction between a human and a model can feel more or less respectful of the human’s expertise (what kind of justification is needed to override a model’s recommendation?) or more or less honest about the model’s limitations (how is uncertainty about a prediction communicated?).
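
As an illustration of that last question, here is a hypothetical sketch of how a model’s uncertainty might be surfaced alongside its recommendation. The function, the threshold band, and the wording are illustrative assumptions, not anything from a real system; the only premise is that the model outputs a calibrated risk probability.

```python
# Hypothetical sketch: be honest about uncertainty when presenting a prediction.
# Assumes a calibrated risk probability (e.g. from a model's predict_proba);
# the threshold band and wording are illustrative choices, not clinical guidance.
def present_recommendation(probability: float,
                           uncertain_band: tuple = (0.4, 0.6)) -> str:
    """Turn a predicted risk probability into a message shown to the clinician."""
    low, high = uncertain_band
    if low <= probability <= high:
        # When the model has little to say, say so, and defer to the human.
        return (f"The model is uncertain (estimated risk {probability:.0%}); "
                "please rely on clinical judgment.")
    label = "elevated" if probability > high else "low"
    return f"Estimated risk is {label} ({probability:.0%})."

print(present_recommendation(0.85))  # "Estimated risk is elevated (85%)."
print(present_recommendation(0.52))  # defers to the clinician
```

Whether the interface abstains, how it phrases deference, and how it displays confidence are exactly the kinds of design questions that interpretability and accuracy metrics alone do not answer.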

Model interpretability is an important factor in building trust, and it can contribute greatly to the success of machine learning applications in healthcare settings. But an exclusive focus on interpretability and performance metrics risks missing equally important elements of trust building, which can only be addressed by thoughtful consideration of the interaction between humans and models. Designing that interaction well is a critical component of any machine learning project in healthcare, and it requires close collaboration between data scientists, product managers, and clinical domain experts.
