A team from the University of Washington has reviewed multiple AI models that have been put forward as potential tools for detecting Covid-19 in patients and found that these models rely on “shortcut learning” to arrive at their conclusions.
When trained to detect disease, AI tools can fail to learn clinically significant indicators and instead latch onto shortcuts, such as – in one infamous case – the presence of a ruler in images of skin cancer.
In the present case, as described in Nature Machine Intelligence, the Covid-19 models used features such as text markings or patient positioning specific to each dataset to detect the presence of the novel coronavirus.
“A physician would generally expect an X-ray finding of Covid-19 to be based on specific patterns in the image that reflect disease processes,” said study co-lead author Alex DeGrave, a PhD student.
“Instead of relying on those patterns, a system using shortcut learning might, for example, judge that a person is older and thus conclude that the disease is more likely because it is more common in older patients.
“The shortcut is not necessarily wrong, but the association is unexpected and not transparent. This can lead to an incorrect diagnosis.”

Shortcut learning, while not technically inaccurate, is not robust and is likely to cause a model to fail outside its original setting.
This makes shortcut learning a serious problem, especially given the opacity of AI decision-making (how a tool makes its predictions is often considered a “black box”). “A model that relies on shortcuts often only works in the hospital in which it was developed,” DeGrave explains, “so if you take the system to a new hospital, it fails – and that failure can point doctors to the wrong diagnosis and improper treatment.”
“Explainable” AI techniques allow researchers to explain in detail how different inputs and their weights contribute to a model’s output, illuminating the black box.
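One of the simplest attribution techniques in this family is input-times-gradient, which scores each input feature by how strongly it pushed the model's output. The sketch below is an illustration of the general idea on a toy logistic model, not the methods used in the study; all the feature names and numbers are hypothetical.

```python
import math

def input_gradient_attribution(w, b, x):
    """Attribute a logistic model's prediction to each input feature
    using the input-times-gradient heuristic -- one simple
    "explainable AI" technique (the study used more elaborate ones).
    All values here are illustrative, not from the paper."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))        # predicted probability
    # For a logistic model, d p / d x_i = p * (1 - p) * w_i
    return [xi * p * (1.0 - p) * wi for wi, xi in zip(w, x)]

# Hypothetical 4-feature "image": the last feature stands in for a
# text marking in the corner that the model has latched onto.
w = [0.1, 0.2, 0.1, 3.0]                  # large weight on the shortcut
b = -0.5
x = [1.0, 1.0, 1.0, 1.0]

scores = input_gradient_attribution(w, b, x)
top = max(range(len(scores)), key=lambda i: abs(scores[i]))
print(top)  # index 3: the shortcut feature dominates the explanation
```

When the highest-scoring feature turns out to be something like a corner text stamp rather than lung tissue, that is exactly the kind of red flag these techniques are designed to surface.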
DeGrave and his colleagues used these approaches to evaluate the trustworthiness of AI models that had been proposed for identifying Covid-19 cases from chest X-rays and reported to produce good results.
The team reasoned that these models would be prone to a condition known as “worst-case confounding,” due to the lack of training data available for a disease as new as Covid-19;
this increased the likelihood that the models would rely on shortcuts rather than learn the underlying pathology of the disease from the data.
“Worst-case confounding causes an AI system to simply learn to recognize datasets instead of learning true disease pathology,” said co-lead author Joseph Janizek, who is also a PhD candidate.
“It’s what happens when all the positive Covid-19 cases come from a single dataset while all the negative cases come from another. And while researchers have devised techniques to reduce associations like this in cases where those associations are less severe, these techniques don’t work in situations where you have a perfect relationship between an outcome like Covid-19 status and a factor like the data source.”
The researchers trained multiple deep convolutional neural networks on X-ray images from a dataset, replicating the approach used in the published papers.
First, they tested each model’s performance on an internal set of images from that initial dataset that had been withheld from the training data.
They then tested how well the models performed on a second dataset intended to represent new hospital systems.
Although the models performed well when tested on images from the first dataset, their accuracy was halved when tested on images from the second.
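The internal-versus-external gap can be sketched with a minimal example: a stand-in "model" that learned only a source-specific cue is perfect on an internal test set where the confound still holds, and no better than chance at a hypothetical external hospital where the cue is independent of disease status. The setup is illustrative, not the study's actual evaluation.

```python
import random

random.seed(1)

def shortcut_predict(marker):
    """Stand-in for a model that learned only a source-specific cue
    (e.g. a text marking present in one hospital's scans)."""
    return 1 if marker > 0.5 else 0

def accuracy(data):
    return sum(shortcut_predict(m) == y for m, y in data) / len(data)

# Internal test set: the confound still holds (positives carry the marker).
internal = [(1.0, 1) for _ in range(50)] + [(0.0, 0) for _ in range(50)]

# External hospital: the marker appears independently of Covid-19 status,
# so the shortcut is no better than a coin flip.
external = [(random.choice([0.0, 1.0]), random.choice([0, 1]))
            for _ in range(1000)]

print(accuracy(internal))   # 1.0 on the internal split
print(accuracy(external))   # roughly 0.5 at the new "hospital"
```

This mirrors the qualitative pattern the team observed: strong internal numbers that collapse as soon as the model leaves the setting whose quirks it memorized.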
The researchers then applied explainable AI techniques to identify which image features most strongly influenced each model’s predictions, and found that the models were taking shortcuts via cues such as patient positioning and text markings on the images.
“We are now increasingly optimistic about the clinical viability of AI for medical imaging. I believe we will eventually have reliable ways to stop AI from learning shortcuts, but it will take a little more work to get there, to ensure that these models can be used safely and effectively to enhance medical decision-making and achieve better outcomes for patients,” said Professor Su-In Lee.