Adversarial sample generation technology of malicious code based on LIME

HUANG Tianbo  LI Chengyang  LIU Yongzhi  LI Denghui  WEN Weiping

Citation: HUANG Tianbo, LI Chengyang, LIU Yongzhi, et al. Adversarial sample generation technology of malicious code based on LIME[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 331-338. doi: 10.13700/j.bh.1001-5965.2020.0397 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0397
Funds: National Natural Science Foundation of China (61872011)

    Corresponding author: WEN Weiping, E-mail: weipingwen@ss.pku.edu.cn

  • CLC number: TP309

  • Abstract:

    Based on the study and analysis of machine-learning-based malicious code detection, this paper proposes a black-box adversarial sample generation method built on Local Interpretable Model-agnostic Explanations (LIME). The method can generate adversarial samples against any black-box malicious code classifier, allowing the samples to evade detection by machine learning models. A simple model is used to approximate the local behavior of the target classifier and obtain feature weights; a perturbation algorithm then generates perturbations, and the original malicious code is modified accordingly to produce adversarial samples. The method was evaluated on the common malware dataset released by Microsoft in 2015 and on benign samples collected from more than 50 vendors. Modeled after common malicious code classifiers, 18 target classifiers based on different algorithms or features were implemented; attacking them with the proposed method reduced the true positive rate of every classifier to nearly 0. In addition, the proposed method was compared with MalGAN and ZOO, two state-of-the-art black-box adversarial sample generation methods. The experimental results show that the proposed method generates adversarial samples effectively and offers broad applicability, flexible control over perturbations, and soundness.
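
    To make the attack flow concrete, the following is a minimal sketch of the LIME-guided perturbation loop in Python. The toy dataset, the random-forest stand-in for the black-box classifier, the fixed perturbation budget, and the add-features-only rule are illustrative assumptions, not the paper's exact algorithm or configuration.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from lime.lime_tabular import LimeTabularExplainer

        rng = np.random.default_rng(0)

        # Toy 0/1 feature vectors standing in for API-presence features;
        # label 1 plays the role of "malicious".
        X = rng.integers(0, 2, size=(500, 40)).astype(float)
        y = (X[:, :5].sum(axis=1) > 2).astype(int)

        # Any model exposing predict_proba can serve as the black box.
        black_box = RandomForestClassifier(n_estimators=100,
                                           random_state=0).fit(X, y)

        explainer = LimeTabularExplainer(X, mode="classification",
                                         discretize_continuous=False)

        def evade(x, budget=5):
            x = x.copy()
            # Fit a local surrogate around x and read off per-feature
            # weights for the malicious class.
            exp = explainer.explain_instance(x, black_box.predict_proba,
                                             labels=(1,), num_features=x.size)
            added = 0
            # Add absent features whose weight pushes the score toward
            # benign; addition only, so program behavior is preserved.
            for idx, weight in sorted(exp.as_map()[1], key=lambda fw: fw[1]):
                if weight >= 0 or added == budget:
                    break
                if x[idx] == 0:
                    x[idx] = 1.0
                    added += 1
            return x

        sample = X[y == 1][0]
        adv = evade(sample)
        print("malicious score before/after:",
              black_box.predict_proba(np.vstack([sample, adv]))[:, 1])

    In the paper's setting, the chosen perturbation must still be mapped back onto the malicious binary (for example, by adding otherwise unused API imports) so that the sample keeps its functionality; the sketch stops at the feature vector.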

     

  • Figure 1  Adversarial sample generation process of malicious code classifier based on LIME

    Figure 2  Dr-TPR and Dr-ASR curves for the three feature types

    Table 1.  Hardware and software environment of the experiment

    Environment    Details
    Hardware       RAM: 16 GB
                   CPU: Intel(R) Core(TM) i7-8550U
    Software       IDA Pro 7.0
                   lime 0.1.1.37
                   keras 2.3.1
                   Python 3.6.3
                   tensorflow 1.15.0
                   numpy 1.18.1
                   sklearn 0.20.0
                   adversarial-robustness-toolbox 1.1.1

    Table 2.  Evaluation indicators

    Indicator                    Formula
    True positive rate (TPR)     TPR = TP / (TP + FN)
    Accuracy (ACC)               ACC = (TP + TN) / (TP + TN + FP + FN)
    Attack success rate (ASR)    ASR = 1 - TPR_after / TPR_before
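
    For reference, the three indicators in Table 2 reduce to a few lines of Python (variable names are illustrative). Note that ASR is the relative drop in TPR caused by the attack:

        def tpr(tp, fn):
            # Share of malicious samples the classifier still flags.
            return tp / (tp + fn)

        def acc(tp, tn, fp, fn):
            # Share of all samples classified correctly.
            return (tp + tn) / (tp + tn + fp + fn)

        def asr(tpr_before, tpr_after):
            # Relative TPR drop: 1 when detection is fully defeated,
            # 0 when the attack has no effect.
            return 1 - tpr_after / tpr_before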

    Table 3.  Target classifier settings

    Algorithm    API    opc-2gram    opc-3gram
    LR           #1     #2           #3
    RF           #4     #5           #6
    SVM          #7     #8           #9
    MLP1         #10    #11          #12
    MLP2         #13    #14          #15
    MLP3         #16    #17          #18
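
    One plausible way to realize this 6 x 3 grid with scikit-learn is sketched below; the hidden-layer sizes for MLP1-MLP3 are assumptions, since Table 3 distinguishes them only by name.

        from sklearn.base import clone
        from sklearn.linear_model import LogisticRegression
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.svm import SVC
        from sklearn.neural_network import MLPClassifier

        prototypes = {
            "LR":   LogisticRegression(max_iter=1000),
            "RF":   RandomForestClassifier(n_estimators=100),
            "SVM":  SVC(probability=True),   # enables predict_proba
            "MLP1": MLPClassifier(hidden_layer_sizes=(64,)),
            "MLP2": MLPClassifier(hidden_layer_sizes=(64, 64)),
            "MLP3": MLPClassifier(hidden_layer_sizes=(64, 64, 64)),
        }
        feature_sets = ["API", "opc-2gram", "opc-3gram"]

        # Enumerate targets #1..#18 row by row, matching Table 3.
        targets = {}
        n = 1
        for algo, proto in prototypes.items():
            for feat in feature_sets:
                targets[n] = (algo, feat, clone(proto))
                n += 1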

    Table 4.  TPR comparison of the API-feature classifiers

    Attack method          TPR/%
                           #1-LR    #4-RF    #7-SVM    #10-MLP1    #13-MLP2    #16-MLP3
    No attack (baseline)   89.44    98.89    95.38     92.22       95.56       95.00
    ZOO                    57.78    91.11    61.67     69.44       72.78       70.56
    MalGAN                 0        1.67     0.56      0           0           0
    Proposed method        0        0        1.67      0           1.11        0

    Table 5.  Average perturbation dimension of adversarial samples generated by two methods

    Attack method      Dr (dimensions changed per sample)
                       #1-LR    #4-RF    #7-SVM    #10-MLP1    #13-MLP2    #16-MLP3
    MalGAN             28.31    23.26    102.87    24.70       35.14       29.45
    Proposed method    9.56     11.62    43.12     15.47       13.39       13.92
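
    Dr counts how many feature dimensions an attack changes; Table 5 reports its average over the test samples. Assuming the 0/1 feature vectors from the sketch after the abstract, it can be computed per sample as:

        import numpy as np

        def perturbation_dimension(original, adversarial):
            # Number of feature dimensions that differ between the
            # original and the adversarial vector.
            return int(np.sum(original != adversarial))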
  • [1] ALAZAB M. Automated malware detection in mobile app stores based on robust feature generation[J]. Electronics, 2020, 9(3): 435.
    [2] SAXE J, BERLIN K. Deep neural network based malware detection using two dimensional binary program features[C]//2015 10th International Conference on Malicious and Unwanted Software (MALWARE). Piscataway: IEEE Press, 2015: 11-20.
    [3] PASCANU R, STOKES J W, SANOSSIAN H, et al. Malware classification with recurrent networks[C]//2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2015: 1916-1920.
    [4] HUANG W Y, STOKES J W. MtNet: A multi-task neural network for dynamic malware classification[C]//International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment(DIMVA). Berlin: Springer, 2016: 399-418.
    [5] KOLOSNJAJI B, ZARRAS A, WEBSTER G, et al. Deep learning for classification of malware system call sequences[C]//Australasian Joint Conference on Artificial Intelligence. Berlin: Springer, 2016: 137-149.
    [6] SCHULTZ M G, ESKIN E, ZADOK F, et al. Data mining methods for detection of new malicious executables[C]//Proceedings 2001 IEEE Symposium on Security and Privacy. Piscataway: IEEE Press, 2001: 38-49.
    [7] KOLTER J Z, MALOOF M A. Learning to detect malicious executables in the wild[C]//Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2004: 470-478.
    [8] KOLTER J Z, MALOOF M A. Learning to detect and classify malicious executables in the wild[J]. Journal of Machine Learning Research, 2006, 7(4): 2721-2744.
    [9] RIBEIRO M T, SINGH S, GUESTRIN C. "Why should I trust you?": Explaining the predictions of any classifier[C]//Proceedings of the 2016 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2016: 1135-1144.
    [10] SU D, ZHANG H, CHEN H G, et al. Is robustness the cost of accuracy? A comprehensive study on the robustness of 18 deep image classification models[C]//Computer Vision-ECCV 2018. Berlin: Springer, 2018: 644-661.
    [11] STOKES J W, WANG D, MARINESCU M, et al. Attack and defense of dynamic analysis-based, adversarial neural malware detection models[C]//2018 IEEE Military Communications Conference (MILCOM). Piscataway: IEEE Press, 2018: 1-8.
    [12] SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks[C]//Proceeding of 2nd International Conference on Learning Representations(ICLR), 2014: 14-16.
    [13] GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[C]//Proceedings of 3rd International Conference on Learning Representations(ICLR), 2015: 7-9.
    [14] HU W, TAN Y. Generating adversarial malware examples for black-box attacks based on GAN[EB/OL]. (2017-02-20)[2020-08-01]. https://arxiv.org/abs/1702.05983.
    [15] CHEN P Y, ZHANG H, SHARMA Y, et al. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models[C]//Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. New York: ACM, 2017: 15-26.
    [16] DOBROVOLJC A, TRČEK D, LIKAR B. Predicting exploitations of information systems vulnerabilities through attackers' characteristics[J]. IEEE Access, 2017, 5: 26063-26075.
    [17] CRANDALL J R, SU Z D, WU S F. On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits[C]//Proceedings of the 12th ACM Conference on Computer and Communications Security. New York: ACM, 2005: 235-248.
    [18] ÍNCER R Í, THEODORIDES M, AFROZ S, et al. Adversarially robust malware detection using monotonic classification[C]//Proceedings of the 4th ACM International Workshop on Security and Privacy Analytics. New York: ACM, 2018: 54-63.
    [19] XU W L, EVANS D, QI Y J. Feature squeezing: Detecting adversarial examples in deep neural networks[EB/OL]. (2017-12-05)[2020-08-01]. https://arxiv.org/abs/1704.01155.
    [20] PAPERNOT N, MCDANIEL P, JHA S, et al. The limitations of deep learning in adversarial settings[C]//2016 IEEE European Symposium on Security and Privacy (EuroS&P). Piscataway: IEEE Press, 2016: 372-387.
    [21] KREUK F, BARAK A, AVIV-REUVEN S, et al. Adversarial examples on discrete sequences for beating whole-binary malware detection[EB/OL]. (2019-01-10)[2020-08-01]. https://arxiv.org/abs/1802.04528v1.
    [22] SWIESKOWSKI P, KUZINS S. Ninite-install or update multiple apps at once[EB/OL]. (2015-10-01)[2020-08-01]. https://ninite.com/.
    [23] RAFF E, BARKER J, SYLVESTER J, et al. Malware detection by eating a whole EXE[C]//The Workshops of the Thirty-Second Conference on Artificial Intelligence (AAAI), 2018: 268-276.
    [24] GOLDBLOOM A, HAMNER B. Microsoft malware classification challenge (BIG 2015)[EB/OL]. (2015-12-15)[2020-08-01]. https://www.kaggle.com/c/malware-classification/data.
    [25] CARLINI N, ATHALYE A, PAPERNOT N, et al. On evaluating adversarial robustness[EB/OL]. (2019-02-20)[2020-08-01]. https://arxiv.org/abs/1902.06705.
Article history
  • Received:  2020-08-09
  • Accepted:  2020-10-23
  • Published online:  2022-02-20
