Volume 48 Issue 2
Feb.  2022
Citation: LIU Yashu, HOU Yueran, YAN Hanbing, et al. Malicious code detection based on heterogeneous information network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 258-265. doi: 10.13700/j.bh.1001-5965.2020.0539 (in Chinese)

Malicious code detection based on heterogeneous information network

doi: 10.13700/j.bh.1001-5965.2020.0539
Funds:

National Key R & D Program of China 2018YFB0803604

National Key R & D Program of China 2018YFB0804704

National Natural Science Foundation of China U1736218

Fundamental Research Funds for Beijing University of Civil Engineering and Architecture X20152

  • Corresponding author: YAN Hanbing, E-mail: yhb@cert.org.cn
  • Received Date: 23 Sep 2020
  • Accepted Date: 18 Dec 2020
  • Publish Date: 20 Feb 2022
  • Malicious code poses a serious threat to network and information security. Detecting malware rapidly, and eliminating or reducing the damage it causes, are important research topics. This paper presents a method that derives dynamic features of malware from dynamic information using a heterogeneous information network (HIN), and uses those features to detect and classify malicious code. Four meta graph schemes over FILE, API and DLL objects are proposed, and the network schema of the malicious code HIN is described. An improved random walk strategy collects the context of the object nodes in each meta graph scheme, and this context is fed to a continuous bag of words (CBOW) model to learn network embeddings as word vectors. The principal angle method is improved with voting, so that the classification results of the multiple meta graph schemes are combined through feature fusion. The proposed method greatly improves the accuracy of malware classification based on the features of each meta graph scheme when only limited information is available.
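
The walk-then-embed step described in the abstract can be illustrated with a minimal sketch. It is not the paper's improved random walk strategy, whose details the abstract does not give; it is a plain meta-path-guided walk over a toy HIN. The node names, the edges, the FILE-API-DLL-API-FILE path, and the helper names `build_adjacency`, `meta_path_walk` and `build_corpus` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy HIN: every node has a type, edges are undirected.
NODE_TYPE = {
    "f1": "FILE", "f2": "FILE",
    "CreateFileW": "API", "RegSetValueExW": "API",
    "kernel32.dll": "DLL", "advapi32.dll": "DLL",
}
EDGES = [
    ("f1", "CreateFileW"), ("f2", "CreateFileW"), ("f2", "RegSetValueExW"),
    ("CreateFileW", "kernel32.dll"), ("RegSetValueExW", "advapi32.dll"),
]


def build_adjacency(edges):
    """Return an undirected adjacency list for the edge pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def meta_path_walk(adj, node_type, start, meta_path, walk_length, rng):
    """Random walk that only steps to neighbours of the type the meta path expects.

    `meta_path` must start and end with the same type so it can be repeated,
    e.g. ["FILE", "API", "DLL", "API", "FILE"].
    """
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta path must be symmetric at its ends")
    if node_type[start] != meta_path[0]:
        raise ValueError("start node type does not match the meta path")
    cycle = meta_path[:-1]  # the repeating part of the path
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # dead end: stop this walk early
            break
        walk.append(rng.choice(candidates))
    return walk


def build_corpus(adj, node_type, meta_path, walks_per_node, walk_length, seed=0):
    """Generate walks from every node whose type opens the meta path."""
    rng = random.Random(seed)
    starts = [n for n, t in node_type.items() if t == meta_path[0]]
    return [
        meta_path_walk(adj, node_type, s, meta_path, walk_length, rng)
        for s in starts
        for _ in range(walks_per_node)
    ]


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    path = ["FILE", "API", "DLL", "API", "FILE"]
    for sentence in build_corpus(adjacency, NODE_TYPE, path, 2, 9):
        print(" ".join(sentence))
```

Each walk is a sequence of node identifiers and can be treated as a sentence. Such a corpus could then be passed to a CBOW implementation, for example gensim's `Word2Vec` with `sg=0`, to obtain one vector per node; that pairing is an assumption here, not something the abstract specifies.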

     

    Figures(5)  / Tables(6)

    Article Metrics

    Article views(508) PDF downloads(28) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return