Adversarial sample generation technology of malicious code based on LIME

HUANG Tianbo; LI Chengyang; LIU Yongzhi; LI Denghui; WEN Weiping

doi:10.13700/j.bh.1001-5965.2020.0397

Volume 48 Issue 2

Feb. 2022

Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 331-338.

Citation:

HUANG Tianbo, LI Chengyang, LIU Yongzhi, et al. Adversarial sample generation technology of malicious code based on LIME[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 331-338. doi: 10.13700/j.bh.1001-5965.2020.0397 (in Chinese)

School of Software & Microelectronics, Peking University, Beijing 102600, China

Funds:

National Natural Science Foundation of China 61872011

Corresponding author: WEN Weiping, E-mail: weipingwen@ss.pku.edu.cn
Received Date: 09 Aug 2020
Accepted Date: 23 Oct 2020
Publish Date: 20 Feb 2022

Abstract

Building on an analysis of machine learning techniques for malicious code detection, this paper proposes a black-box adversarial example generation method based on local interpretable model-agnostic explanations (LIME). The method generates adversarial samples against any black-box malicious code classifier so that they bypass detection by the machine learning model. It uses a simple model to approximate the local behavior of the target classifier, obtains the feature weights from that model, and generates disturbances with a disturbance algorithm. It then modifies the original malicious code according to the generated disturbances to produce adversarial samples. The method was tested on the malicious sample dataset released by Microsoft in 2015 and on benign samples collected from more than 50 vendors. Eighteen target classifiers, based on different algorithms or features, were implemented with reference to common malicious code classifiers. When attacked with the method, the true positive rates of these classifiers dropped to approximately zero. Two advanced black-box adversarial sample generation methods, MalGAN and ZOO, were reproduced for comparison. The experimental results show that the proposed method generates adversarial samples effectively and has several strengths, including broad applicability, flexible control of disturbances, and soundness.
Keywords:
- adversarial samples
- malicious code
- machine learning
- local interpretable model-agnostic explanations (LIME)
- target classifiers
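
The workflow outlined in the abstract (fit a simple local model to the black-box classifier, read off feature weights, then perturb the sample) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `black_box`, `lime_weights`, `perturb`, the binary feature encoding, the 0.3 flip probability, the exponential kernel, and the add-only perturbation budget are all hypothetical choices made for illustration.

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for the target classifier: returns P(malicious)."""
    w = [2.0, 1.5, -3.0, 0.0, -2.5, 1.0]
    z = sum(wi * xi for wi, xi in zip(w, x)) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def solve(A, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol


def lime_weights(predict, x, n_samples=500, kernel_width=0.75, ridge=1e-3, seed=0):
    """Fit a locally weighted linear surrogate around x using only queries to predict."""
    rng = random.Random(seed)
    d = len(x)
    rows, ys, ws = [], [], []
    for _ in range(n_samples):
        # Assumption: neighbours are drawn by flipping each binary feature with p = 0.3.
        z = [xi ^ 1 if rng.random() < 0.3 else xi for xi in x]
        dist = sum(a != b for a, b in zip(x, z)) / d
        ws.append(math.exp(-(dist ** 2) / kernel_width ** 2))  # closer samples count more
        rows.append(z + [1])  # trailing 1 is the intercept column
        ys.append(predict(z))
    n = d + 1
    # Weighted ridge regression via the normal equations.
    A = [[sum(w * r[i] * r[j] for r, w in zip(rows, ws)) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(w * r[i] * y for r, w, y in zip(rows, ws, ys)) for i in range(n)]
    return solve(A, b)[:d]  # drop the intercept


def perturb(predict, x, weights, budget=3, threshold=0.5):
    """Assumption: only add absent features, most benign-leaning first, up to a budget."""
    adv = list(x)
    candidates = sorted((i for i in range(len(x)) if x[i] == 0 and weights[i] < 0),
                        key=lambda i: weights[i])
    for i in candidates[:budget]:
        if predict(adv) < threshold:
            break
        adv[i] = 1
    return adv


x = [1, 1, 0, 0, 0, 1]  # toy "malicious" sample
weights = lime_weights(black_box, x)
adv = perturb(black_box, x, weights)
print(black_box(x), black_box(adv))
```

Restricting the perturbation to adding features is one common way to avoid removing behavior the original sample depends on; whether the paper's disturbance algorithm applies the same restriction is not stated in the abstract.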

References

[1]	ALAZAB M. Automated malware detection in mobile app stores based on robust feature generation[J]. Electronics, 2020, 9(3): 435.
[2]	SAXE J, BERLIN K. Deep neural network based malware detection using two dimensional binary program features[C]//2015 10th International Conference on Malicious and Unwanted Software (MALWARE). Piscataway: IEEE Press, 2015: 11-20.
[3]	PASCANU R, STOKES J W, SANOSSIAN H, et al. Malware classification with recurrent networks[C]//2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2015: 1916-1920.
[4]	HUANG W Y, STOKES J W. MtNet: A multi-task neural network for dynamic malware classification[C]//International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment(DIMVA). Berlin: Springer, 2016: 399-418.
[5]	KOLOSNJAJI B, ZARRAS A, WEBSTER G, et al. Deep learning for classification of malware system call sequences[C]//Australasian Joint Conference on Artificial Intelligence. Berlin: Springer, 2016: 137-149.
[6]	SCHULTZ M G, ESKIN E, ZADOK F, et al. Data mining methods for detection of new malicious executables[C]//Proceedings 2001 IEEE Symposium on Security and Privacy. Piscataway: IEEE Press, 2000: 38-49.
[7]	KOLTER J Z, MALOOF M A. Learning to detect malicious executables in the wild[C]//Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2004: 470-478.
[8]	KOLTER J Z, MALOOF M A. Learning to detect and classify malicious executables in the wild[J]. Journal of Machine Learning Research, 2006, 7(4): 2721-2744.
[9]	RIBEIRO M T, SINGH S, GUESTRIN C. "Why should I trust you?": Explaining the predictions of any classifier[C]//Proceedings of the 2016 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2016: 1135-1144.
[10]	SU D, ZHANG H, CHEN H G, et al. Is robustness the cost of accuracy? A comprehensive study on the robustness of 18 deep image classification models[C]//Computer Vision-ECCV 2018, 2018: 644-661.
[11]	STOKES J W, WANG D, MARINESCU M, et al. Attack and defense of dynamic analysis-based, adversarial neural malware detection models[C]//2018 IEEE Military Communications Conference (MILCOM). Piscataway: IEEE Press, 2018: 1-8.
[12]	SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks[C]//Proceedings of the 2nd International Conference on Learning Representations (ICLR), 2014: 14-16.
[13]	GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[C]//Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015: 7-9.
[14]	HU W, TAN Y. Generating adversarial malware examples for black-box attacks based on GAN[EB/OL]. (2017-02-20)[2020-08-01].
[15]	CHEN P Y, ZHANG H, SHARMA Y, et al. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models[C]//Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. New York: ACM, 2017: 15-26.
[16]	DOBROVOLJC A, TRČEK D, LIKAR B. Predicting exploitations of information systems vulnerabilities through attackers' characteristics[J]. IEEE Access, 2017, 5: 26063-26075.
[17]	CRANDALL J R, SU Z D, WU S F. On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits[C]//Proceedings of the 12th ACM Conference on Computer and Communications Security. New York: ACM, 2005: 235-248.
[18]	ÍNCER R Í, THEODORIDES M, AFROZ S, et al. Adversarially robust malware detection using monotonic classification[C]//Proceedings of the 4th ACM International Workshop on Security and Privacy Analytics. New York: ACM, 2018: 54-63.
[19]	XU W L, EVANS D, QI Y J. Feature squeezing: Detecting adversarial examples in deep neural networks[EB/OL]. (2017-12-05)[2020-08-01].
[20]	PAPERNOT N, MCDANIEL P, JHA S, et al. The limitations of deep learning in adversarial settings[C]//2016 IEEE European Symposium on Security and Privacy (EuroS&P). Piscataway: IEEE Press, 2016: 372-387.
[21]	KREUK F, BARAK A, AVIV-REUVEN S, et al. Adversarial examples on discrete sequences for beating whole-binary malware detection[EB/OL]. (2019-01-10)[2020-08-01].
[22]	SWIESKOWSKI P, KUZINS S. Ninite-install or update multiple apps at once[EB/OL]. (2015-10-01)[2020-08-01].
[23]	RAFF E, BARKER J, SYLVESTER J, et al. Malware detection by eating a whole exe[C]//The Workshops of the Thirty-Second Conference on Artificial Intelligence (AAAI), 2018: 268-276.
[24]	GOLDBLOOM A, HAMNER B. Microsoft malware classification challenge (BIG 2015)[EB/OL]. (2015-12-15)[2020-08-01].
[25]	CARLINI N, ATHALYE A, PAPERNOT N, et al. On evaluating adversarial robustness[EB/OL]. (2019-02-20)[2020-08-01].