Journal of Beijing University of Aeronautics and Astronautics ›› 2022, Vol. 48 ›› Issue (5): 795-804. doi: 10.13700/j.bh.1001-5965.2020.0658

• Article

Android malicious APP multi-view family classification method

HAO Jingwei1, LUO Senlin1, ZHANG Hanqing1, YANG Peng2, PAN Limin1

  1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China;
  2. National Computer Network Emergency Response Technical Team and Coordination Center, Beijing 100029, China
  • Received: 2020-11-25 Published: 2022-05-30
  • Corresponding author: YANG Peng E-mail: yp@cert.org.cn
  • Supported by:
    242 National Information Security Projects (2019A012); 2020 Information Security Software Project of the Ministry of Industry and Information Technology (CEIEC-2020-ZM02-0134)

Abstract: Existing Android malware family classification methods build features that are incomplete and drawn from a single perspective. To address these problems, a malicious APP family classification method based on multi-view feature regularization and a convolutional neural network (CNN) is proposed. The method uses the MinHash algorithm to regularize the raw features from three views (Android framework system APIs, opcode sequences, and the combination of permissions and Intents in the AndroidManifest.xml file) while preserving the similarity among APPs. A multi-path CNN then performs feature extraction and information fusion across the views, yielding a malicious APP family classification model. Experimental results on the public datasets Drebin, Genome, and AMD show that the family classification accuracy exceeds 0.96, indicating that the proposed method can fully exploit the behavioral feature information of each view and effectively use the heterogeneity among multi-view features, and that it has strong practical value.

Key words: Android malware, family classification, multi-view features, behavioral semantics, convolutional neural network (CNN)
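
The abstract states that MinHash regularizes variable-size raw features while preserving similarity among APPs. The following is a minimal Python sketch of that general idea, not the authors' implementation: each feature set, whatever its size, is mapped to a fixed-length signature, and the fraction of matching signature positions estimates the Jaccard similarity of the original sets. The function names, the use of `hashlib.blake2b` as the hash family, the signature length, and the sample permission sets are all assumptions made for illustration.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """Seeded 64-bit hash of a feature token (blake2b is an illustrative choice)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(features, num_hashes=128):
    """Map a feature set of any size to a fixed-length MinHash signature."""
    feats = set(features)
    if not feats:
        raise ValueError("feature set must be non-empty")
    # One minimum per seeded hash function gives one signature position.
    return [min(_hash(f, seed) for f in feats) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of equal signature positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets, for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Hypothetical permission sets for two APPs (illustrative values only).
app_a = {"SEND_SMS", "READ_CONTACTS", "INTERNET", "RECEIVE_BOOT_COMPLETED"}
app_b = {"SEND_SMS", "READ_CONTACTS", "INTERNET", "ACCESS_FINE_LOCATION"}

sig_a = minhash_signature(app_a)
sig_b = minhash_signature(app_b)
print(len(sig_a), len(sig_b))            # both signatures have the same fixed length
print(exact_jaccard(app_a, app_b))       # similarity of the raw sets
print(estimated_jaccard(sig_a, sig_b))   # approximation from the signatures
```

Because every signature has the same length, signatures from different views could be stacked or reshaped into fixed-size inputs for a neural network. How the paper arranges them for its CNN is not described in the abstract and is left open here.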

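Copyright © Editorial Office of Journal of Beijing University of Aeronautics and Astronautics
Address: Editorial Office of Journal of Beijing University of Aeronautics and Astronautics, No. 37 Xueyuan Road, Haidian District, Beijing Postal code: 100191 E-mail: jbuaa@buaa.edu.cn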