Journal of Chongqing Technology and Business University (Natural Science Edition)

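Cite this article: ZENG Fanmao, FANG Xianjin. Black-box Transfer Attack Method Based on Substitute Models[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2025, 42(3): 70-76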

Black-box Transfer Attack Method Based on Substitute Models

ZENG Fanmao, FANG Xianjin

School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China

Abstract:

Objective Existing data-free black-box attack methods based on generative adversarial networks are prone to slow convergence and high cost. To address these problems, a novel black-box transfer attack method is proposed. Methods The method consists of two stages: training data synthesis and substitute model distillation. In the training data synthesis stage, the generator is optimized to maximize the consistency between the outputs of the substitute model and the target model, and two loss functions are introduced to constrain the distribution of the data it generates. In the substitute model distillation stage, the substitute model is built from residual blocks with learnable parameters, and the data synthesized by the generator is used to fit the decision boundary of the target model. Alternating between these two training stages lets the substitute model fit the target model's decision boundary more closely, which in turn improves attack effectiveness. Results In a series of experiments, the success rate of untargeted black-box attacks against the target model exceeded 70%. On the CIFAR100 dataset, the targeted attack success rate was more than 2% higher than that of other black-box attack methods, and a lower query budget was needed to achieve the same attack effect. Conclusion The proposed method efficiently fits the decision boundary of the target model and achieves good attack effectiveness.

Key words: adversarial examples; black-box attack; transfer attack; substitute model distillation
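
As an illustration only, the alternating loop described in the abstract (synthesize training data, query the target, distill into a substitute, then attack through the substitute) can be sketched on a toy 2-D problem. None of the specifics below come from the paper; they are all assumptions made for the sketch. The target is a hypothetical hidden linear rule. The substitute is a logistic regression rather than a residual-block network. `synthesize` is a simple uncertainty-based sampler standing in for the trained generator and its two loss functions. The attack is an FGSM-style sign step, since the abstract does not name the attack used on the substitute.

```python
import math
import random

# Hypothetical black-box target: a hidden linear rule; only labels are observable.
_HIDDEN = (2.0, -1.0, 0.5)

def target_label(x):
    """One query to the black box; returns 0 or 1."""
    return 1 if _HIDDEN[0] * x[0] + _HIDDEN[1] * x[1] + _HIDDEN[2] > 0 else 0

def substitute_prob(params, x):
    """Substitute model (logistic regression): P(label = 1 | x)."""
    z = params[0] * x[0] + params[1] * x[1] + params[2]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def synthesize(params, rng, n):
    """Stage 1 stand-in (assumption): sample 4n candidates and keep the n
    closest to the substitute's current decision boundary."""
    candidates = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(4 * n)]
    candidates.sort(key=lambda x: abs(substitute_prob(params, x) - 0.5))
    return candidates[:n]

def distill(params, batch, lr=0.5, steps=20):
    """Stage 2: query the target once per sample, then fit the substitute
    to those labels by gradient descent on the cross-entropy loss."""
    w0, w1, b = params
    labels = [target_label(x) for x in batch]
    n = len(batch)
    for _ in range(steps):
        g0 = g1 = gb = 0.0
        for x, y in zip(batch, labels):
            err = substitute_prob((w0, w1, b), x) - y  # dLoss/dlogit
            g0 += err * x[0]
            g1 += err * x[1]
            gb += err
        w0, w1, b = w0 - lr * g0 / n, w1 - lr * g1 / n, b - lr * gb / n
    return (w0, w1, b)

def train_substitute(rounds=30, batch_size=32, seed=0):
    """Alternate the two stages; returns (params, number of target queries)."""
    rng = random.Random(seed)
    params, queries = (0.0, 0.0, 0.0), 0
    for _ in range(rounds):
        batch = synthesize(params, rng, batch_size)
        params = distill(params, batch)
        queries += len(batch)
    return params, queries

def sign_step(params, x, eps):
    """FGSM-style untargeted step computed on the substitute only."""
    push = -1.0 if substitute_prob(params, x) > 0.5 else 1.0
    return tuple(xi + eps * push * math.copysign(1.0, wi)
                 for xi, wi in zip(x, params[:2]))

def evaluate(params, eps=1.0, n=500, seed=1):
    """Returns (substitute/target agreement, transfer attack success rate)."""
    rng = random.Random(seed)
    points = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(n)]
    agree = sum((substitute_prob(params, x) > 0.5) == (target_label(x) == 1)
                for x in points)
    fooled = sum(target_label(sign_step(params, x, eps)) != target_label(x)
                 for x in points)
    return agree / n, fooled / n

if __name__ == "__main__":
    params, queries = train_substitute()
    agreement, success = evaluate(params)
    print(f"queries={queries} agreement={agreement:.2f} transfer_success={success:.2f}")
```

The query count returned by `train_substitute` is the toy analogue of the query budget mentioned in the results. The numbers this sketch prints describe only the toy problem and are not comparable to the success rates reported for the paper's experiments.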