A Two-Stage Generative Adversarial Networks With Semantic Content Constraints for Adversarial Example Generation

IEEE 2020 · Jianyi Liu, Yu Tian, Ru Zhang (Member, IEEE), Youqiang Sun, and Chan Wang

Deep neural networks (DNNs) have achieved great success in various applications due to their strong expressive power. However, recent studies have shown that DNNs are vulnerable to adversarial examples: manipulated instances that can mislead a DNN into making false predictions. Existing methods of generating adversarial examples rely on pixel-level perturbation or spatial transformation of images, and cannot account for the semantic quality of the adversarial examples and the attack success rate at the same time. These methods are also computationally expensive and slow to generate adversarial examples. To address these issues, this paper proposes a two-stage generative adversarial network (TSGAN) with semantic content constraints. The first stage trains the generator G on the original example dataset, so that the generator learns the distribution of real examples. The second stage adopts a semantic quality constraint loss function, an adversarial loss function, and a distance loss function, so that G continues training to search for the distribution of adversarial examples, yielding a new generator Gadv. The adversarial examples generated by Gadv better fit the distribution of real examples and have targeted black-box attack capability. Experiments show that adversarial examples generated by TSGAN achieve an attack success rate of 98.40% on the target model, 29.40% on the defense-oriented model, and 77.58% in the transfer attack test. The results show that the adversarial examples generated by the proposed model have a high attack success rate and are more difficult to defend against. They also have stronger transferability than those of existing models.
The proposed model effectively reduces the expression of target-category features in the adversarial examples, and the generated adversarial examples have better semantic quality than those of other methods.
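
The abstract names three second-stage loss terms but does not give their formulas or how they are combined. As a rough illustration only, the sketch below assumes a weighted sum, a cross-entropy term toward the attacker's chosen class, and a mean squared pixel distance. The function names, the weights `w_adv`, `w_dist`, `w_sem`, and the treatment of the semantic quality term as a precomputed scalar are all hypothetical and are not taken from the paper.

```python
import math

def targeted_adversarial_loss(probs, target_class):
    """Cross-entropy toward the attacker's chosen class (assumed form)."""
    # Clamp to avoid log(0) when the target model assigns zero probability.
    return -math.log(max(probs[target_class], 1e-12))

def distance_loss(x, x_adv):
    """Mean squared difference between the original and adversarial
    example, both given as flat lists of pixel values (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_adv)) / len(x)

def second_stage_loss(probs, target_class, x, x_adv, semantic_term,
                      w_adv=1.0, w_dist=1.0, w_sem=1.0):
    """Hypothetical weighted sum of the three terms named in the abstract.

    semantic_term stands in for the semantic quality constraint loss,
    whose definition the abstract does not give; it is passed in as a
    precomputed scalar.
    """
    return (w_adv * targeted_adversarial_loss(probs, target_class)
            + w_dist * distance_loss(x, x_adv)
            + w_sem * semantic_term)
```

Under these assumptions, a generator update in the second stage would minimize `second_stage_loss` over a batch, with the weights trading off attack success against closeness to the original example and semantic quality.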
