
Android malicious APP multi-view family classification method

HAO Jingwei, LUO Senlin, ZHANG Hanqing, YANG Peng, PAN Limin

Citation: HAO Jingwei, LUO Senlin, ZHANG Hanqing, et al. Android malicious APP multi-view family classification method[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(5): 795-804. doi: 10.13700/j.bh.1001-5965.2020.0658 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0658

Funds:

242 National Information Security Projects 2019A012

2020 Information Security Software Project of the Ministry of Industry and Information Technology CEIEC-2020-ZM02-0134

Corresponding author:

    YANG Peng, E-mail: yp@cert.org.cn

  • CLC number: V219; TP317
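  • Abstract:

    Existing Android malware family classification methods build incomplete feature sets and rely on a single, homogeneous view. To address these problems, this paper proposes a convolutional neural network (CNN) method for malicious APP family classification based on multi-view feature regularization. The method uses the MinHash algorithm to regularize the raw features of three views (the Android framework system APIs, the opcode sequences, and the combination of permissions and Intents in the AndroidManifest.xml file) while preserving the similarity between APPs. A multi-channel CNN then extracts features from each view and fuses the information to build a malicious APP family classification model. Experiments on the public datasets Drebin, Genome, and AMD show that the family classification accuracy exceeds 0.96, demonstrating that the proposed method fully mines the behavioral feature information of each view, makes effective use of the heterogeneity among multi-view features, and has strong practical value.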
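The regularization step described in the abstract can be illustrated with a minimal MinHash sketch. This is not the authors' implementation: the function names, the signature length of 128, the salted BLAKE2b hash family, and the sample API names are all assumptions made for illustration. The sketch shows how feature sets of different sizes map to fixed-length signatures whose slot-wise agreement rate estimates the Jaccard similarity between two APPs.

```python
import hashlib


def minhash_signature(features, num_hashes=128):
    """Map a non-empty set of string features to a fixed-length signature.

    Each slot keeps the minimum of one salted 64-bit hash over all features
    (an assumed hash family; the page does not specify one).
    """
    if not features:
        raise ValueError("features must be non-empty")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for f in features
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of slots on which two equal-length signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


# Illustrative API-view feature sets for two APPs (sample values, not paper data).
app_a = {
    "Landroid/telephony/SmsManager;->sendTextMessage",
    "Landroid/telephony/TelephonyManager;->getDeviceId",
    "Ljava/net/URL;->openConnection",
    "Landroid/content/Context;->startService",
}
app_b = {
    "Landroid/telephony/SmsManager;->sendTextMessage",
    "Landroid/telephony/TelephonyManager;->getDeviceId",
    "Landroid/location/LocationManager;->getLastKnownLocation",
    "Ljava/lang/Runtime;->exec",
}

sig_a = minhash_signature(app_a)
sig_b = minhash_signature(app_b)
print("estimated:", estimated_jaccard(sig_a, sig_b))
print("exact:    ", exact_jaccard(app_a, app_b))
```

Because every signature has the same length regardless of how many raw features an APP contains, signatures of this kind can be fed to a fixed-input network, which is one plausible reading of "regularization" here.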

     

  • Figure 1.  Principle framework of the Android malware family classification method based on multi-view feature regularization and a convolutional neural network

    Figure 2.  Visualization of the 3-view features of a plankton family sample

    Figure 3.  Multi-view CNN structure

    Figure 4.  OP view CNN structure

    Figure 5.  API view and MF view CNN structure

    Figure 6.  Classification accuracy of different models under different data partitioning methods

    Table 1.  System permissions and system-defined Intent information

    Element                 Count  Examples
    System permissions      95     android.permission.ACCESS_NETWORK_STATE
                                   android.permission.CAMERA
                                   android.permission.ADD_SYSTEM_SERVICE
                                   android.permission.WRITE_CONTACTS
                                   android.permission.REBOOT
    System-defined Intents  85     android.intent.category.BROWSABLE
                                   com.android.settings.APPLICATION_SETTINGS
                                   com.android.settings.WIFI_IP_SETTINGS
                                   android.intent.action.CALL
                                   android.intent.category.CAR_DOCK
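As a rough illustration of how the MF view might be collected, the sketch below extracts permission and Intent names from a manifest and keeps only those in a system vocabulary. Several things here are assumptions rather than details from the paper: the manifest is taken to be already decoded to text XML (the manifest inside an APK is binary), the vocabularies contain only the ten examples listed in Table 1 rather than the full 95 and 85 entries, and the function names and sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace that prefixes android:name attributes in a decoded manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Partial vocabularies: only the examples shown in Table 1 (assumption).
KNOWN_PERMISSIONS = {
    "android.permission.ACCESS_NETWORK_STATE",
    "android.permission.CAMERA",
    "android.permission.ADD_SYSTEM_SERVICE",
    "android.permission.WRITE_CONTACTS",
    "android.permission.REBOOT",
}
KNOWN_INTENTS = {
    "android.intent.category.BROWSABLE",
    "com.android.settings.APPLICATION_SETTINGS",
    "com.android.settings.WIFI_IP_SETTINGS",
    "android.intent.action.CALL",
    "android.intent.category.CAR_DOCK",
}


def manifest_names(manifest_xml):
    """Return (permissions, intents) named in decoded manifest text."""
    root = ET.fromstring(manifest_xml)
    permissions = {
        node.get(ANDROID_NS + "name") for node in root.iter("uses-permission")
    }
    intents = {
        node.get(ANDROID_NS + "name")
        for tag in ("action", "category")
        for node in root.iter(tag)
    }
    permissions.discard(None)  # elements without android:name
    intents.discard(None)
    return permissions, intents


def mf_view_features(manifest_xml, known_permissions, known_intents):
    """Keep only system-defined names, permissions first, then Intents."""
    permissions, intents = manifest_names(manifest_xml)
    return sorted(permissions & known_permissions) + sorted(intents & known_intents)


# Hypothetical decoded manifest used only to exercise the functions.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.demo">
    <uses-permission android:name="android.permission.CAMERA"/>
    <uses-permission android:name="com.example.demo.CUSTOM"/>
    <application>
        <activity android:name=".Main">
            <intent-filter>
                <action android:name="android.intent.action.CALL"/>
                <category android:name="android.intent.category.BROWSABLE"/>
            </intent-filter>
        </activity>
    </application>
</manifest>"""

features = mf_view_features(SAMPLE_MANIFEST, KNOWN_PERMISSIONS, KNOWN_INTENTS)
print(features)
```

The app-defined permission `com.example.demo.CUSTOM` is dropped because it is not in the system vocabulary, which mirrors the table's restriction to system permissions and system-defined Intents.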

    Table 2.  Overview of the experimental software and hardware environment

    Item               Configuration
    Dell server        Intel(R) Xeon(R) Gold 5120 CPU 2.20 GHz, GPU Tesla T4×4, Ubuntu 16.04 64-bit
    Development tools  PyTorch 1.3.0, Python 3.5, Androguard 3.3.5

    Table 3.  Neural network parameter settings

    Setting                 Value
    Optimizer               AdamW
    Initial learning rate   0.001
    epoch                   100
    batch_size              128

    Table 4.  Ablation test results for malicious family classification

    Train:test ratio  Metric   API    OP     MF     API+OP  API+MF  OP+MF  API+OP+MF
    1:10              Acc      0.908  0.877  0.862  0.914   0.914   0.864  0.920
                      Pweight  0.910  0.876  0.860  0.915   0.917   0.867  0.921
                      Rweight  0.908  0.877  0.862  0.914   0.914   0.864  0.920
                      Fweight  0.907  0.872  0.854  0.908   0.908   0.853  0.914
    1:5               Acc      0.932  0.922  0.877  0.948   0.945   0.910  0.947
                      Pweight  0.928  0.922  0.877  0.947   0.943   0.910  0.946
                      Rweight  0.932  0.922  0.870  0.948   0.945   0.910  0.947
                      Fweight  0.928  0.917  0.873  0.946   0.942   0.907  0.944
    1:2               Acc      0.963  0.941  0.918  0.962   0.965   0.928  0.964
                      Pweight  0.957  0.94   0.912  0.963   0.963   0.927  0.963
                      Rweight  0.963  0.941  0.918  0.962   0.965   0.928  0.964
                      Fweight  0.959  0.939  0.913  0.961   0.962   0.924  0.961
    5:1               Acc      0.979  0.964  0.956  0.986   0.984   0.977  0.990
                      Pweight  0.974  0.966  0.951  0.986   0.985   0.977  0.990
                      Rweight  0.979  0.964  0.956  0.986   0.984   0.977  0.990
                      Fweight  0.976  0.964  0.952  0.985   0.984   0.976  0.989
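The page does not define Pweight, Rweight, and Fweight. Assuming they follow the common convention of averaging per-family precision, recall, and F1 weighted by each family's share of the true labels, they could be computed as in the sketch below. The function name and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall, weighted F1).

    Weights are each family's share of the true labels (assumed convention).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = len(y_true)
    support = Counter(y_true)      # true samples per family
    predicted = Counter(y_pred)    # predictions per family
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / total
    p_weight = r_weight = f_weight = 0.0
    for family, n in support.items():
        precision = correct[family] / predicted[family] if predicted[family] else 0.0
        recall = correct[family] / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = n / total
        p_weight += weight * precision
        r_weight += weight * recall
        f_weight += weight * f1
    return accuracy, p_weight, r_weight, f_weight


# Toy labels for three families (illustrative only).
y_true = ["plankton", "plankton", "plankton", "family_a", "family_a", "family_b"]
y_pred = ["plankton", "plankton", "family_a", "family_a", "family_a", "family_b"]

acc, p_w, r_w, f_w = weighted_metrics(y_true, y_pred)
print(acc, p_w, r_w, f_w)
```

Under this convention the weighted recall is algebraically equal to the accuracy, which is consistent with most of the paired Acc and Rweight rows in Table 4.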

    Table 5.  Malicious family sample data

    Database  Number of families  Number of APPs
    Drebin    130                 5347
    Genome    33                  1185
    AMD       42                  5065

    Table 6.  Test results on different databases

    Database  Acc    Pweight  Rweight  Fweight
    Genome    0.982  0.987    0.982    0.982
    Drebin    0.965  0.963    0.965    0.962
    AMD       0.976  0.970    0.962    0.978

    Table 7.  Comparative experimental results

    Method       Database  Acc
    Dendroid     Genome    0.942
    Apposcopy    Genome    0.900
    DroidSIFT    Genome    0.930
    MudFlow      Genome    0.881
    TriFlow      Genome    0.881
    DroidLegacy  Genome    0.929
    Astroid      Genome    0.938
    Astroid      AMD       0.943
    FalDroid     Genome    0.972
    FalDroid     Drebin    0.953
    This paper   Genome    0.982
    This paper   Drebin    0.965
    This paper   AMD       0.976
Publication history
  • Received: 2020-11-25
  • Accepted: 2020-12-25
  • Published online: 2022-05-20
