
How to deal with the AI interpretability crisis: should we care more about explanation or about verification?

Much of the current boom in artificial intelligence applications comes from advances in machine learning, especially deep learning. But the opacity behind that intelligence keeps raising the question of whether AI must be interpretable before it can be used in high-risk scenarios, particularly in critical domains such as healthcare, finance, and government.

Deep learning, built on deep neural networks, is characterized by its ability to learn autonomously from large amounts of data and construct its own system of rules without human intervention. However, between a deep learning model's input data and its output, inside the complex layered structure of the artificial neural network, lie vast numbers of parameters and values that humans cannot readily understand, making it impossible to explain precisely why the AI makes a particular prediction in a particular situation.

This is the "AI explainability crisis" that many people have heard of.

On March 23, Fortune published an article arguing that AI does have an interpretability crisis, but it may not be the one you think. The article closes with the view that "when it comes to artificial intelligence in the real world, what we should care about is not interpretation but verification".

So, what's wrong with the pursuit of explainable AI?

"We believe that the desire to build trust through current interpretability methods represents a false hope that individual users or users affected by AI will be able to judge the quality of AI decisions by reviewing interpretations (i.e., explanations specific to that individual decision). Those using such systems may have misunderstood the capabilities of contemporary interpretability technologies — they can produce broad descriptions of how AI systems work in a general sense, but for individual decisions, these explanations are unreliable or, in some cases, can only provide a superficial level of explanation. Recently, Marzyeh Ghassemi, a computer scientist at the Massachusetts Institute of Technology, Luke Oakden-Rayner, a radiologist and researcher at the Australian Machine Learning Institute, and Andrew Beam, a researcher in the Department of Epidemiology at the Harvard School of Public Health, wrote in a paper published in the medical journal Lancet Digital Health.


Attempts to produce humanly intelligible explanations of machine learning decisions generally fall into two categories: intrinsic interpretability and post-hoc interpretability.

When a model's inputs are simple and comprehensible enough, quantifying the relationship between those inputs and the model's output is called intrinsic interpretability. For example, an AI can be trained from the outset to recognize prototypical features of a disease, such as a "ground glass" pattern in the lungs, and then tell the doctor how closely it thinks the examined image matches each prototype.
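To make the idea concrete, here is a minimal sketch of what such an intrinsically interpretable, prototype-style score might look like. The feature names, prototype values, and case values below are purely hypothetical and are not taken from any system discussed in the paper.

```python
# Minimal sketch of a prototype-matching score over human-readable findings.
# All feature names and numbers are hypothetical, for illustration only.
import numpy as np

FEATURES = ["ground_glass_opacity", "consolidation", "pleural_effusion"]

# Hypothetical "typical pneumonia" profile over the same findings.
pneumonia_prototype = np.array([0.9, 0.7, 0.2])

def prototype_match(case_features: np.ndarray) -> float:
    """Cosine similarity between a case and the prototype (1.0 = perfect match)."""
    num = float(case_features @ pneumonia_prototype)
    den = np.linalg.norm(case_features) * np.linalg.norm(pneumonia_prototype)
    return num / den if den else 0.0

case = np.array([0.8, 0.6, 0.1])   # finding scores extracted for one examined image
print(f"match with pneumonia prototype: {prototype_match(case):.2f}")
for name, c, p in zip(FEATURES, case, pneumonia_prototype):
    print(f"  {name}: case={c:.1f}, prototype={p:.1f}")
```

Because the features themselves are human-readable, both the match score and the per-feature comparison can be inspected directly, which is exactly what makes this style of model "intrinsically" interpretable.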

This may seem intuitive and simple, but the authors found that it still depends heavily on human judgment: whether the right prototype features were chosen, and whether each feature was weighted appropriately in reaching the conclusion.

Even intrinsically interpretable models can be hard to work with in practice because of unrecognized confounding factors. And in many modern AI use cases, the data and models are far too complex and high-dimensional to be explained by a simple relationship between inputs and outputs.

Post-hoc interpretability, by contrast, tries to dissect a trained model's decision-making process through various means. One popular form is the heat map, which highlights how much each region of an image contributed to a given decision; heat maps are intuitive and commonly used with medical imaging models.


Illustration: heat maps generated by post-hoc interpretation methods for a deep learning model that detects pneumonia on chest X-rays (brighter red marks regions the network treats as more important; darker blue marks regions it treats as less important).
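For readers curious how such a map is produced, below is a minimal sketch of one common gradient-based approach, assuming a PyTorch image classifier. The tiny stand-in model and random input tensor are placeholders, not the pneumonia detector shown above.

```python
# Minimal sketch of a gradient-based heat map ("saliency map") for an image
# classifier. The model and input below are placeholders for illustration.
import torch
import torch.nn as nn

model = nn.Sequential(               # stand-in classifier, not a real detector
    nn.Flatten(), nn.Linear(224 * 224, 2)
)
model.eval()

image = torch.rand(1, 1, 224, 224, requires_grad=True)  # one grayscale "X-ray"
logits = model(image)
logits[0, 1].backward()              # gradient of the "pneumonia" logit

# Pixel-wise importance: magnitude of the gradient w.r.t. each input pixel.
heatmap = image.grad.abs().squeeze()
print(heatmap.shape)                 # torch.Size([224, 224])
```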

But Ghassemi and colleagues found that in heat maps meant to explain why an AI classified a patient as having pneumonia, even the "hottest" regions (those that most influenced the decision) contained information a doctor would regard as both useful and useless; merely locating a region does not reveal what, exactly, within that region the model found informative.

"Clinicians do not know whether the model properly determines whether the presence of cloudiness in the airspace is important in the decision, whether the shape of the heart boundary or left pulmonary artery is a determining factor, or whether the model relies on features that are not relevant to humans, such as specific pixel values or textures, that may be related to the image acquisition process rather than the underlying disease," Ghassemi, Oakden-Rayner, and Beam wrote.

In the absence of such information, they note, people tend to assume that the AI is attending to the same features a human clinician would find important. This cognitive bias can blind doctors to the mistakes a machine learning algorithm may be making.

The researchers also found flaws in other popular interpretability methods, such as Grad-CAM, LIME, and Shapley values. Some of these methods perturb input data points until the algorithm's prediction changes, and then assume that those data points must have been the most important for the original prediction.
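As a rough illustration of that perturbation idea (not the exact procedure of any of the named methods), the sketch below blanks out one image patch at a time and treats the resulting drop in the model's score as that patch's importance. The `predict_proba` function is a hypothetical stand-in for a trained classifier.

```python
# Minimal sketch of perturbation-based importance (occlusion sensitivity):
# mask one region at a time and record how much the model's score drops.
import numpy as np

def predict_proba(img: np.ndarray) -> float:
    """Stand-in model: returns a 'pneumonia' probability for a 2-D image."""
    return float(img.mean())          # illustrative only, not a real model

def occlusion_map(img: np.ndarray, patch: int = 8) -> np.ndarray:
    base = predict_proba(img)
    h, w = img.shape
    importance = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = 0.0   # blank out one patch
            # A large drop in the score marks the patch as "important".
            importance[i // patch, j // patch] = base - predict_proba(occluded)
    return importance

img = np.random.rand(64, 64)
print(occlusion_map(img).shape)       # (8, 8) grid of importance scores
```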

But these methods share the heat maps' problem: they may identify features that mattered for a decision, yet they cannot tell doctors why the algorithm thinks those features matter. What should a doctor do when such a feature seems counterintuitive? Is the algorithm wrong, or has it uncovered a clinically important clue previously unknown to medicine? Either is possible.


To make matters worse, different post-hoc explanation methods often disagree with one another about the same prediction. In a Feb. 8 paper, "The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective," researchers from Harvard, MIT, Carnegie Mellon, and Drexel found that, in the real world, most people who use these methods cannot resolve such disagreements and, as Ghassemi and others have suggested, often simply pick the explanation that best fits their existing beliefs.
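To see why such disagreement is hard to ignore in practice, here is a toy sketch that compares two hypothetical explanation outputs for the same prediction using a rank correlation. The feature names and importance numbers are invented purely for illustration.

```python
# Toy sketch of the "disagreement problem": two explanation methods rank the
# same features differently, and a rank correlation makes that explicit.
from scipy.stats import spearmanr

features  = ["opacity", "heart_border", "texture", "pixel_intensity"]
method_a  = [0.70, 0.10, 0.15, 0.05]   # hypothetical importances from one method
method_b  = [0.20, 0.05, 0.45, 0.30]   # hypothetical importances from another

rho, _ = spearmanr(method_a, method_b)
print(f"rank agreement between the two explanations: rho = {rho:.2f}")
```

A low or negative correlation leaves the practitioner with exactly the dilemma the paper describes: two "explanations" of the same decision that cannot both be right.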

Zachary Lipton, a professor of computer science at Carnegie Mellon University, told Fortune, "Every serious person in the healthcare space knows that most of today's explainable AI is nonsense." Lipton said many radiologists have come to him for help after their hospitals deployed supposedly interpretable AI systems for medical imaging, because the explanations those systems produced were meaningless, or at least had nothing to do with what the radiologists actually wanted to know.

However, companies continue to market their AI systems as "explainable," Lipton said, because they believe they have to in order to make a sale: "They say, 'Doctors won't believe it without an explanation.' But maybe they shouldn't believe it."

In the worst case, according to a 2020 study published in the British Medical Journal (The BMJ), the explanations on offer serve to mask the fact that most deep learning algorithms used in medical imaging have never undergone the kind of rigorous, double-blind randomized controlled trials required before new drugs can be approved.

"We recommend that end users of interpretable AI, including clinicians, legislators, and regulators, be aware of the limitations of interpretable AI that currently exists. We believe that if we want to ensure that AI systems can operate safely and reliably, then the focus should be on rigorous and thorough verification procedures. Ghassemi, Oakden-Rayner, and Beam came to a somewhat counterintuitive conclusion that doctors shouldn't focus on explanation, but on how AI works and whether it's rigorously, scientifically tested.

They point out that medicine is full of drugs and techniques that doctors use because they work, even though no one knows exactly why: acetaminophen, for example, has been used to treat pain for a century, yet we still do not fully understand its underlying mechanism.
