Large-scale IoT malware analysis and classification method

HE Qinglin; WANG Lihong; LUO Bing; YANG Libin

doi:10.13700/j.bh.1001-5965.2020.0401

Volume 48 Issue 2

Feb. 2022


Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 240-248.

Citation: HE Qinglin, WANG Lihong, LUO Bing, et al. Large-scale IoT malware analysis and classification method[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 240-248. doi: 10.13700/j.bh.1001-5965.2020.0401 (in Chinese)


HE Qinglin^{1,2}, WANG Lihong^{2}, LUO Bing^{1}, YANG Libin^{3}

1. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
2. School of Computer Science and Engineering, Beihang University, Beijing 100083, China
3. School of Cybersecurity, Northwestern Polytechnical University, Xi'an 710072, China

Funds: National Key R&D Program of China (2017YFC1201204)


Corresponding author: WANG Lihong, E-mail: wlh@isc.org.cn
Received Date: 09 Aug 2020
Accepted Date: 05 Sep 2020
Publish Date: 20 Feb 2022

Abstract


In recent years, Internet of things (IoT) malware has emerged in large numbers and attacks IoT devices in cyberspace. However, because malware source code is openly available, the family characteristics of IoT malware are not distinct, and a more fine-grained classification method is needed to support the discovery of advanced threat malware and the tracking of attack organizations. To address this problem, we conducted a large-scale analysis of 157 911 IoT malware samples discovered from May 2019 to May 2020 and labeled a dataset of 12 278 samples in 9 categories. We then propose an IoT malware classification method whose main idea is to extract complex structural features, including the function call graph (FCG) and text, through static reverse analysis. Features are learned with graph representation learning and text representation learning, and experiments on the labeled dataset show an average recall of 88.1%. The method has been deployed in practice and works well.
Keywords: Internet of things (IoT); malware; classification; graph learning; text learning
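
The abstract names two ingredients without giving details: features learned from the function call graph, and an average recall reported over the labeled categories. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes Weisfeiler-Lehman subtree relabeling as a stand-in for the unspecified graph representation learning step, and it assumes "average recall" means the unweighted mean of per-class recall. The function names `wl_features` and `macro_recall`, the toy call graph, and the initial labels "sub" and "lib" are all hypothetical.

```python
from collections import Counter


def wl_features(graph, labels, iterations=2):
    """Return a bag of Weisfeiler-Lehman subtree labels for a call graph.

    graph:  dict mapping each function name to the list of functions it calls
    labels: dict mapping each function name to an initial label
            (hypothetical here: "lib" for library code, "sub" for local code)
    """
    current = dict(labels)
    bag = Counter(current.values())
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # New label = own label plus the sorted labels of the callees.
            callee_labels = sorted(current[c] for c in graph[node])
            updated[node] = current[node] + "(" + ",".join(callee_labels) + ")"
        current = updated
        bag.update(current.values())
    return bag


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (an assumed reading of 'average recall')."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy call graph: main calls scan and attack, and both call connect.
toy_fcg = {
    "main": ["scan", "attack"],
    "scan": ["connect"],
    "attack": ["connect"],
    "connect": [],
}
toy_labels = {"main": "sub", "scan": "sub", "attack": "sub", "connect": "lib"}
features = wl_features(toy_fcg, toy_labels, iterations=1)
```

The resulting label counts can be fed to any standard classifier as a sparse feature vector. In this sketch, two samples whose call graphs share many relabeled subtrees end up with similar vectors, which is the intuition behind comparing malware by graph structure rather than by raw bytes.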


