IEEE paper accepted
Theresa Bender (1); Jacqueline M. Beinecke (1); Dagmar Krefting (1); Carolin Müller (2); Henning Dathe (1); Tim Seidler (2); Nicolai Spicher (1); Anne-Christin Hauschild (1)
(1) Department of Medical Informatics, UMG
(2) Department for Cardiology & Pneumology/Heart Center, UMG
Despite their remarkable performance, deep neural networks remain unadopted in clinical practice, which is considered to be partially due to their lack of explainability. In this work, we apply attribution methods to a pre-trained deep neural network (DNN) for 12-lead electrocardiography classification to open this "black box" and understand the relationship between model prediction and learned features. We classify data from a public data set, and the attribution methods assign a "relevance score" to each sample of the classified signals. This makes it possible to analyze what the network learned during training, for which we propose quantitative methods: average relevance scores over a) classes, b) leads, and c) average beats. The analyses of relevance scores for atrial fibrillation (AF) and left bundle branch block (LBBB) compared to healthy controls show that their mean values a) increase with higher classification probability and correspond to false classifications when around zero, and b) correspond to clinical recommendations regarding which lead to consider. Furthermore, c) visible P-waves and concordant T-waves result in clearly negative relevance scores in AF and LBBB classification, respectively.
In a nutshell: Our analysis suggests that the DNN learned features similar to cardiology textbook knowledge.
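To make the three averaging schemes named in the abstract more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the array layout (records × samples × leads), the function names, and the assumption that R-peak positions are already available are all hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_relevance_per_class(relevance, labels):
    """Average relevance per class label.

    relevance: array of shape (n_records, n_samples, n_leads) (assumed layout)
    labels:    array of shape (n_records,) with one class label per record
    """
    # One mean score per record, taken over all samples and leads.
    per_record = relevance.mean(axis=(1, 2))
    return {c.item(): float(per_record[labels == c].mean())
            for c in np.unique(labels)}


def mean_relevance_per_lead(relevance):
    """Average relevance per lead, over all records and samples."""
    return relevance.mean(axis=(0, 1))  # shape: (n_leads,)


def average_beat_relevance(record_relevance, r_peaks, before, after):
    """Average relevance over beats of a single record.

    record_relevance: array of shape (n_samples, n_leads)
    r_peaks:          sample indices of R-peaks (assumed to be given)
    before, after:    window size in samples around each R-peak
    """
    n_samples = record_relevance.shape[0]
    # Keep only beats whose window lies fully inside the record.
    windows = [record_relevance[r - before:r + after]
               for r in r_peaks
               if r - before >= 0 and r + after <= n_samples]
    if not windows:
        raise ValueError("no complete beat window inside the record")
    return np.mean(windows, axis=0)  # shape: (before + after, n_leads)
```

Under these assumptions, a class mean near zero would be the kind of value the abstract associates with false classifications, and the per-lead means could be compared against the leads that clinical recommendations single out.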
Journal: IEEE Journal of Biomedical and Health Informatics, 2023, Journal article
Link to full text: https://doi.org/10.1109/jbhi.2023.3271858
Link to the author's ORCiD: https://orcid.org/0000-0001-6721-7034