More transparency for AI in diagnostics

PD Dr. Tobias Lasser and Alessandro Wollek (front) are comparing a TMME saliency map (left screen) with a GradCAM saliency map (right screen). Image: Carolin Lerch / TUM



Interview with computer scientist Alessandro Wollek and ethics researcher Theresa Willem

Artificial Intelligence (AI) has the potential to support diagnoses in radiology. Until now, however, a lack of transparency has often made it difficult to understand the recommendations made by AI. Researchers have now investigated whether and how the visual representations used in AI image analysis, known as saliency maps, can help. In this interview, Alessandro Wollek, doctoral candidate in computer science, and Theresa Willem, doctoral candidate in medical ethics, explain how their results can help make AI more transparent.

Mr. Wollek, what is the role of AI in radiological diagnostics?

Wollek: AI could support radiologists in making faster and more confident diagnoses. This would be especially useful in stressful situations and for radiologists who still have little practical experience, and it could free up more time for consultation.

Where is the difficulty?

Willem: The AI decision-making process has to be transparent enough for physicians to make the best possible assessment of how far they can trust the AI recommendation. That has been difficult up to now. This was our point of departure: we wanted to find out how to make the process more transparent.

What was the initial situation like for your study?

Wollek: Up to now, radiology has used what are called Convolutional Neural Network (CNN) algorithms. Other fields have already adopted the newer Vision Transformer (ViT) algorithms. Although these are generally better, i.e. they make fewer mistakes in diagnosis, they also require enormous amounts of data. And since data sets of this order of magnitude are not available in radiology, ViT algorithms haven't been used there in the past.

Using the example of pneumothorax, we investigated whether ViT algorithms work just as well as CNN algorithms for radiological image analysis, despite the relatively small data sets. And we were able to confirm that they do: ViTs provided results comparable to those of CNNs for all the parameters we measured.

What is the benefit of now being able to use Vision Transformers as well?

Wollek: You can create a kind of map for every recommendation made by the AI system showing which image segments the AI system included in its decision-making process. You can use a certain type of map for ViTs which doesn’t exist for CNNs and which we assumed would be more useful.

These maps, referred to as saliency maps, show the areas within an image on which the neural network has based its prediction. There are various kinds: Gradient-weighted Class Activation Mapping (GradCAM) and Transformer-Multi-Modal-Explainability (TMME) maps. For CNN algorithms only GradCAM can be used, whereas ViT algorithms can also use TMME. We were now able to use TMME saliency maps in radiology for the first time, and as a result we could compare both types of saliency map in terms of their quality.
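To give a feel for how transformer-based saliency maps are built, here is a minimal sketch of attention rollout, a simpler relative of the TMME method described above (TMME additionally weights attention by gradients and relevance scores; this toy uses synthetic attention matrices, not the study's model or data):

```python
import numpy as np

def attention_rollout(attentions):
    """Combine per-layer attention matrices into one token-to-token map.

    attentions: list of (tokens, tokens) row-stochastic attention matrices,
    one per layer, already averaged over heads. Residual connections are
    modeled by mixing in the identity, then the layers are multiplied through.
    """
    n = attentions[0].shape[0]
    rollout = np.eye(n)
    for A in attentions:
        A = 0.5 * (A + np.eye(n))              # account for residual connections
        A = A / A.sum(axis=-1, keepdims=True)  # keep rows summing to 1
        rollout = A @ rollout
    return rollout

# Toy example: 3 layers, 5 tokens (1 class token + 4 image patches).
rng = np.random.default_rng(0)
layers = []
for _ in range(3):
    A = rng.random((5, 5))
    layers.append(A / A.sum(axis=-1, keepdims=True))

rollout = attention_rollout(layers)
# Saliency over the 4 image patches = how much the class token attends to each.
saliency = rollout[0, 1:]
```

Reshaping `saliency` back to the patch grid and upsampling it to the input resolution yields the heatmap overlaid on the x-ray image.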

Which type of map was better? And according to which criteria?

Wollek: We began with the question: Do the saliency maps really reliably show what the AI system uses for the recommendation? It would also be possible that the maps were defective and emphasized image segments which might at first appear plausible to us as observers, but which didn’t really correspond to the areas on which the AI system has based its statement. In order to investigate that, we evaluated both saliency maps in terms of quantitative criteria. For all the criteria we measured, the TMME saliency maps performed better; simply put, they were more reliable when it came to showing which components of an image the AI system based its diagnosis recommendation on.
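One common way to quantify such faithfulness is a perturbation ("deletion") test: if a saliency map truly marks the pixels the model relies on, masking the most salient pixels first should lower the model's score faster than masking irrelevant ones. This is a generic illustration of that idea with a toy scoring function, not the study's exact criteria:

```python
import numpy as np

def deletion_curve(score_fn, image, saliency, steps=10):
    """Mask pixels in order of decreasing saliency, recording the model score."""
    order = np.argsort(saliency.flatten())[::-1]  # most salient pixels first
    masked = image.copy().flatten()
    scores = [score_fn(masked.reshape(image.shape))]
    chunk = max(1, len(order) // steps)
    for i in range(0, len(order), chunk):
        masked[order[i:i + chunk]] = 0.0          # delete the next chunk
        scores.append(score_fn(masked.reshape(image.shape)))
    return np.array(scores)

# Toy "model": its score is the mean intensity of a known important region.
important = np.zeros((8, 8))
important[:4, :4] = 1.0
score_fn = lambda img: float((img * important).mean())

image = np.ones((8, 8))
good_map = important        # a faithful map: points at the true evidence
bad_map = 1.0 - important   # an unfaithful map: points at irrelevant pixels

good = deletion_curve(score_fn, image, good_map)
bad = deletion_curve(score_fn, image, bad_map)
# The faithful map's score drops sooner: good[1] < bad[1].
```

A map whose deletion curve falls steeply is reliably highlighting the evidence the model actually uses, which is the property the comparison above is testing.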

How do these results help radiologists?

Willem: In our pilot study three radiologists each assessed 70 x-ray images with TMME and 70 x-ray images with GradCAM saliency maps. They found 47 percent of TMME saliency maps helpful in diagnosis, but only 39 percent of the GradCAM saliency maps.

How can these findings be used in the future?

Willem: We will soon be launching another study with a significantly larger number of radiologists. If the results confirm our findings, saliency maps could be used in the future in AI-driven diagnostics. We hope we can make AI diagnosis recommendations a little more transparent for use in clinical practice.

Alessandro Wollek, Robert Graf, Saša Čečatka, Nicola Fink, Theresa Willem, Bastian O. Sabel, and Tobias Lasser: Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification, Radiology: Artificial Intelligence (2023). DOI: https://doi.org/10.1148/ryai.220187

Scientific Contact

PD Dr. Tobias Lasser
Technical University of Munich
89 289 10807
lasser@cit.tum.de
https://ciip.in.tum.de/