From the Perspective of AI Safety: Analyzing the Impact of XAI Performance on Adversarial Attack

Abdullah Mohamed Asiri; Feng Wu; Zhiyi Tian; Shui Yu

doi:10.1109/globecom52923.2024.10901342

Abstract

The outstanding performance of machine learning models across various fields has led to their use in sensitive, high-risk applications such as healthcare, automated driving, and security services. However, the explainability of their outputs and the safety of their operation remain major concerns. Although Explainable AI (XAI) can enhance the interpretability of AI models, further research is needed to evaluate how effectively it explains adversarial attacks. XAI techniques and adversarial attacks are central issues for the explainability and the security of deep learning, respectively. Moreover, the relationship between the two, such as the vulnerability of an XAI explanation to an adversarial attack, is key to addressing these concerns. In this paper, we use the Saliency Map as an XAI technique to explain the behavior of the Fast Gradient Sign Method (FGSM) adversarial attack on the ResNet model, and we show how vulnerable this explanation is to the attack. Extensive experiments show that as the severity of the FGSM attack on the ResNet model increases, the Saliency Map gradually fails, exposing its potential vulnerability.
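
To make the two techniques named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy network (four inputs, three tanh hidden units, one logit) in place of ResNet, a plain gradient-magnitude saliency map, and cosine similarity as an illustrative way to compare maps; the weights, the input, and all function names are invented for illustration. FGSM perturbs the input by `eps` in the direction of the sign of the loss gradient, and the saliency map is the absolute gradient of the logit with respect to the input.

```python
import math

# Hypothetical toy model (assumption): 4 inputs -> 3 tanh hidden units -> 1 logit.
# The weights are arbitrary illustrative values, not taken from the paper.
W1 = [[0.8, -0.5, 0.3, 0.1],
      [-0.2, 0.9, -0.7, 0.4],
      [0.5, 0.2, 0.6, -0.9]]
B1 = [0.1, -0.1, 0.05]
W2 = [1.2, -0.8, 0.7]
B2 = 0.0


def forward(x):
    """Return the hidden activations and the predicted probability of class 1."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, B1)]
    logit = sum(w * hi for w, hi in zip(W2, h)) + B2
    return h, 1.0 / (1.0 + math.exp(-logit))


def logit_grad(x):
    """Analytic gradient of the logit with respect to each input feature."""
    h, _ = forward(x)
    return [sum(W2[j] * (1.0 - h[j] ** 2) * W1[j][i] for j in range(len(W2)))
            for i in range(len(x))]


def loss_grad(x, y):
    """Gradient of binary cross-entropy w.r.t. the input: (p - y) * dlogit/dx."""
    _, p = forward(x)
    return [(p - y) * g for g in logit_grad(x)]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(x, y, eps):
    """FGSM: move each feature by eps along the sign of the loss gradient."""
    return [xi + eps * sign(g) for xi, g in zip(x, loss_grad(x, y))]


def saliency(x):
    """Gradient-magnitude saliency: |dlogit/dx| for each input feature."""
    return [abs(g) for g in logit_grad(x)]


def cosine(a, b):
    """Cosine similarity between two saliency maps (0.0 if either is all zero)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return dot / norm if norm > 0 else 0.0


# Illustrative input and label (assumptions), with increasing attack strength.
x, y = [0.5, -1.0, 0.25, 0.8], 1
clean_map = saliency(x)
for eps in (0.0, 0.1, 0.3, 0.5):
    x_adv = fgsm(x, y, eps)
    print(eps, round(cosine(clean_map, saliency(x_adv)), 3))
```

The loop prints, for each `eps`, how similar the saliency map of the attacked input is to the map of the clean input. On a real image classifier the same comparison would use the gradient of the predicted class score with respect to the pixels, typically computed by automatic differentiation rather than by hand.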
