Authors
Hyun KWON, Changhyun CHO, Jun LEE
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Information and Systems (ISSN: 0916-8532)
Volume, issue, pages, and publication date
vol.E105-D, no.11, pp.1880-1889, 2022-11-01
Citation count
2

Deep neural networks (DNNs) perform well on machine learning tasks such as image recognition, speech recognition, pattern recognition, and intrusion detection. However, an adversarial example, created by adding small perturbations to the original data, can cause a DNN to misclassify it even though the human eye cannot distinguish it from the original. For example, if an attacker crafts a modified right-turn traffic sign, an autonomous vehicle's DNN may misclassify the modified sign as a U-turn sign, while a human still correctly recognizes it as a right-turn sign. Such adversarial examples are a serious threat to DNNs. Recently, a multi-target adversarial example was introduced: a single modified image that causes each of several models to misclassify it into that model's target class. However, this approach has a weakness: as the number of target models increases, the overall attack success rate decreases. Therefore, when an attacker wishes to attack multiple models, the attacker must control the attack success rate for each model according to that model's attack priority. In this paper, we propose a priority adversarial example that takes the attack priority of each model into account when targeting multiple models. The proposed method controls the attack success rate for each model by adjusting the weight of each model's attack term in the generation process while keeping distortion minimal. We used the MNIST and CIFAR10 datasets and the TensorFlow machine learning library. Experimental results show that the proposed method can control the attack success rate for each model according to its attack priority while maintaining minimal distortion (on average 3.95 and 2.45 on MNIST for targeted and untargeted attacks, respectively, and on average 51.95 and 44.45 on CIFAR10 for targeted and untargeted attacks, respectively).
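The abstract does not give the authors' exact objective function, so the following is only a minimal sketch of the general idea: a Carlini-Wagner-style optimization in which each target model contributes an attack term weighted by its priority, balanced against an L2 distortion term. The function name, the hyperparameters (steps, lr, c), and the use of targeted cross-entropy as the attack term are illustrative assumptions, not the paper's implementation.

```python
import tensorflow as tf

def priority_adversarial_example(x, models, targets, weights,
                                 steps=100, lr=0.01, c=1.0):
    # x:       original input of shape (1, H, W, C), values in [0, 1]
    # models:  list of tf.keras models, assumed to return logits
    # targets: one target class index per model
    # weights: per-model priority weights (higher weight = higher attack priority)
    delta = tf.Variable(tf.zeros_like(x))  # perturbation to optimize
    opt = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            x_adv = tf.clip_by_value(x + delta, 0.0, 1.0)
            # Distortion term: keep the adversarial example close to the original.
            loss = tf.reduce_sum(tf.square(x_adv - x))
            # Weighted attack terms: each model's cross-entropy toward its
            # target class, scaled by that model's priority weight.
            for model, t, w in zip(models, targets, weights):
                ce = tf.keras.losses.sparse_categorical_crossentropy(
                    [t], model(x_adv), from_logits=True)
                loss += c * w * tf.reduce_sum(ce)
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))

    return tf.clip_by_value(x + delta, 0.0, 1.0)
```

Under this formulation, raising a model's weight makes the optimizer trade more distortion (and more attack strength against the other models) for a higher success rate against that model, which matches the abstract's description of controlling per-model success rates via the weights.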