Citation: HAO Jingwei, PAN Limin, LI Rui, et al. Low redundancy feature selection method for Android malware detection[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(2): 225-232. doi: 10.13700/j.bh.1001-5965.2020.0567 (in Chinese)
A low-redundancy feature selection method for Android malware detection is proposed to address the feature redundancy that arises when feature selection pays excessive attention to features whose frequency distribution is the same across classes. First, the method uses the Mann-Whitney test to select features whose frequency distribution is biased between classes. It then applies the appearance ratio interval algorithm to quantify each feature's degree of bias and appearance frequency, and rejects features with low bias and low usage frequency across the software samples as a whole. Finally, particle swarm optimization is combined with the model's detection performance to obtain the optimal feature subset. Experiments were conducted on the public DREBIN and AMD datasets. On the AMD dataset, 294 features were selected and the detection accuracy of the six classifiers improved by 1%-5%. On the DREBIN dataset, 295 features were selected, fewer than with the four comparison methods, and the detection accuracy of the six classifiers improved by 1.7%-5%. These results indicate that the proposed method reduces feature redundancy in Android malware detection and improves detection accuracy.
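The first stage described above (keeping only features whose frequency distribution differs between classes) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary presence/absence features, a significance level of 0.05, and a two-sided Mann-Whitney test computed with the tie-corrected normal approximation (no continuity correction). The function names, the threshold `alpha`, and the toy data are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def mann_whitney_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Mann-Whitney p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        return 1.0
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                     # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # every value identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def select_biased_features(malware: List[List[int]], benign: List[List[int]],
                           names: List[str], alpha: float = 0.05) -> List[str]:
    """Keep features whose distribution differs between the two classes."""
    kept = []
    for idx, name in enumerate(names):
        m_col = [row[idx] for row in malware]
        b_col = [row[idx] for row in benign]
        if mann_whitney_p(m_col, b_col) < alpha:
            kept.append(name)
    return kept


# Toy data (hypothetical): columns are [SEND_SMS, INTERNET].
malware = [[1, 1]] * 16 + [[0, 1]] * 2 + [[0, 0]] * 2
benign = [[1, 1]] * 2 + [[0, 1]] * 16 + [[0, 0]] * 2
selected = select_biased_features(malware, benign, ["SEND_SMS", "INTERNET"])
print(selected)  # ['SEND_SMS']
```

In this toy data, `INTERNET` appears equally often in both classes, so the test rejects it, while `SEND_SMS` is strongly biased toward the malware class and is kept. The later stages (the appearance ratio interval algorithm and the particle swarm search) are not sketched here because the abstract does not specify their details.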