Abstract: Correlation analysis of threat intelligence has become an effective means of tracing cyber attacks to their source. Threat intelligence analysis reports on different advanced persistent threat (APT) organizations were crawled from public threat intelligence sources, and a report classification method based on the graph attention mechanism is proposed. The goal is to determine whether a newly published threat intelligence report belongs to a known attack organization, which supports further expert analysis. A threat intelligence knowledge graph is designed, tactical and technical intelligence is extracted, and the attributes of malicious samples, IPs and domain names are mined to construct a complex network; a graph attention neural network then classifies the threat intelligence report nodes. Evaluation shows that the proposed method reaches an accuracy of 78% under an imbalanced class distribution, so the organization behind a threat intelligence report can be identified effectively.
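The network construction step summarized in the abstract can be pictured with a small sketch. It assumes, for illustration only, that each report has already been reduced to a set of (entity type, value) pairs; the report names, entity values and helper functions below are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical extraction result: report id -> set of (entity_type, value) pairs.
REPORT_ENTITIES = {
    "report_A": {("IP", "198.51.100.7"), ("Technique", "T1566")},
    "report_B": {("IP", "198.51.100.7"), ("Domain", "example.org")},
    "report_C": {("Technique", "T1059")},
}


def build_graph(report_entities):
    """Return an undirected adjacency map over report nodes and entity nodes."""
    adj = defaultdict(set)
    for report, entities in report_entities.items():
        report_node = ("Report", report)
        adj[report_node]  # touch the key so reports without entities still appear
        for entity in entities:
            adj[report_node].add(entity)
            adj[entity].add(report_node)
    return adj


def related_reports(adj, report):
    """Names of reports that share at least one entity node with `report`."""
    start = ("Report", report)
    hits = set()
    for entity in adj[start]:
        hits.update(node[1] for node in adj[entity] if node != start)
    return sorted(hits)


graph = build_graph(REPORT_ENTITIES)
print(related_reports(graph, "report_A"))
```

Reports that reuse the same infrastructure or techniques end up two hops apart in such a graph, which is the kind of neighborhood a graph neural network can aggregate over when classifying report nodes.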
Table 1. Examples of IOC regular expressions
Entity type  Regular expression
MD5          [a-f0-9]{32}|[A-F0-9]{32}
SHA1         [a-f0-9]{40}|[A-F0-9]{40}
SHA256       [a-f0-9]{64}|[A-F0-9]{64}
CVE          CVE-[0-9]{4}-[0-9]{4,6}
IP           ((25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))\.){3}(25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))
Domain       ([a-zA-Z0-9][-a-zA-Z0-9]{0,62}\.+?)([a-zA-Z][-a-zA-Z]{0,62})
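As a rough illustration of how the patterns in Table 1 might be applied, the sketch below runs the hash, CVE and IP expressions over a piece of report text. The word boundaries (`\b`), the non-capturing groups, the `extract_iocs` helper and the sample sentence are additions made for this example and are not part of the paper. The Domain pattern is left out to keep the sketch short.

```python
import re

# Patterns adapted from Table 1. Word boundaries and non-capturing groups are
# assumptions of this sketch so that findall() returns whole matches only.
IOC_PATTERNS = {
    "MD5": re.compile(r"\b(?:[a-f0-9]{32}|[A-F0-9]{32})\b"),
    "SHA1": re.compile(r"\b(?:[a-f0-9]{40}|[A-F0-9]{40})\b"),
    "SHA256": re.compile(r"\b(?:[a-f0-9]{64}|[A-F0-9]{64})\b"),
    "CVE": re.compile(r"\bCVE-[0-9]{4}-[0-9]{4,6}\b"),
    "IP": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\b"
    ),
}


def extract_iocs(text):
    """Return a dict mapping each IOC type to a sorted list of unique matches."""
    found = {}
    for name, pattern in IOC_PATTERNS.items():
        matches = sorted(set(pattern.findall(text)))
        if matches:
            found[name] = matches
    return found


# Made-up sentence in the style of a threat report.
sample = (
    "The dropper (MD5 d41d8cd98f00b204e9800998ecf8427e) exploits "
    "CVE-2017-11882 and beacons to 192.168.1.10."
)
iocs = extract_iocs(sample)
print(iocs)
```

The word boundaries keep the 32-character MD5 pattern from firing inside a longer SHA1 or SHA256 string, which the bare patterns would otherwise do.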
Table 2. Aliases of selected APT organizations
APT organization  Aliases
Sofacy            APT28, PawnStorm, FancyBear, Sednit, SNAKEMACKEREL, TsarTeam, TG-4127, Group-4127, STRONTIUM, TAG_0700, Swallowtail, IRONTWILIGHT, Group74, SIG40, GrizzlySteppe, apt_sofacy
BITTER            T-APT-17, APT-C-08, 蔓灵花
APT32             海莲花, OceanLotusGroup, OceanLotus, CobaltKitty, APT-C-00, SeaLotus, APT-32, OceanBuffalo, PONDLOACH, TINWOODLAWN
Confucius         孔夫子
SideWinder        响尾蛇
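An alias table such as Table 2 is typically used to fold differently named reports into one label per organization. The following is a minimal sketch of that idea, not the paper's implementation: the `ALIASES` dictionary holds only a subset of the aliases, and the matching rule (ignore case, spaces, hyphens and underscores) is an assumption made here.

```python
# Subset of Table 2; the canonical names are the table's first column.
ALIASES = {
    "Sofacy": ["APT28", "PawnStorm", "FancyBear", "Sednit", "STRONTIUM"],
    "BITTER": ["T-APT-17", "APT-C-08", "蔓灵花"],
    "APT32": ["海莲花", "OceanLotus", "APT-C-00", "SeaLotus", "APT-32"],
    "Confucius": ["孔夫子"],
    "SideWinder": ["响尾蛇"],
}


def _key(name):
    """Normalize a name: case-insensitive, without spaces, hyphens or underscores."""
    return "".join(ch for ch in name.casefold() if ch not in " -_")


# Reverse index from every normalized alias to its canonical organization.
LOOKUP = {}
for canonical, names in ALIASES.items():
    for name in [canonical, *names]:
        LOOKUP[_key(name)] = canonical


def canonical_group(name):
    """Return the canonical organization for an alias, or None if unknown."""
    return LOOKUP.get(_key(name))


print(canonical_group("Fancy Bear"))
print(canonical_group("apt-c-08"))
```

Normalizing before the lookup means that spelling variants such as "Pawn Storm" and "PawnStorm" resolve to the same organization without being listed twice.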
Table 3. Threat intelligence sources used in this paper
Threat intelligence source          Vendors
Domestic (Chinese) security vendors NSFOCUS (绿盟), QiAnXin (奇安信), 360, ThreatBook (微步在线), Antiy (安天)
Foreign security vendors            MITRE ATT&CK, CheckPoint, CrowdStrike, Microsoft, Trend Micro, Symantec, FireEye, Kaspersky, WeLiveSecurity, Malwarebytes, Mandiant, AhnLab, VirusTotal
GitHub open-source project          APTnotes
Table 4. Hyperparameter settings of the model
Parameter     Value
seed          72
weight_decay  5×10⁻⁴
nb_head       32
α             0.2
lr            0.005
hidden        8
dropout       0.3
patience      200
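To make the role of some of these hyperparameters concrete, here is a minimal pure-Python sketch of a single graph attention head in the form given by Veličković et al. [8]: scores e_ij = LeakyReLU(a · [W h_i || W h_j]) are normalized with a softmax over each node's neighborhood and used to average the transformed neighbor features. It assumes that α in Table 4 is the LeakyReLU negative slope, as in common GAT implementations; the toy graph, the weights and the function names are invented for the example, and the output nonlinearity and dropout are omitted.

```python
import math

ALPHA = 0.2  # LeakyReLU negative slope; assumed to correspond to α in Table 4


def leaky_relu(x, alpha=ALPHA):
    return x if x > 0 else alpha * x


def linear(W, h):
    """Multiply the (out_dim x in_dim) matrix W by the feature vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_head(features, neighbors, W, a):
    """One attention head; neighbors[i] lists the nodes i attends to (self included)."""
    z = [linear(W, h) for h in features]
    outputs, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # Unnormalized scores e_ij = LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j]))) for j in nbrs
        ]
        # Softmax over the neighborhood, shifted by the maximum for stability
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        attention.append(dict(zip(nbrs, weights)))
        # Attention-weighted sum of the transformed neighbor features
        outputs.append(
            [sum(w * z[j][k] for w, j in zip(weights, nbrs)) for k in range(len(W))]
        )
    return outputs, attention


# Toy graph with three nodes and two-dimensional features (illustrative values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [0, 1, 2], [2]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.0]
outputs, attention = gat_head(features, neighbors, W, a)
print(attention[0])
```

Under the same assumption, nb_head = 32 and hidden = 8 would correspond to 32 such heads with 8 output features each, concatenated before the classification layer.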
Table 5. Performance evaluation indicators
Organization   P     R     F1    Reports
APT29          0.70  0.89  0.78  18
APT32          0.80  0.67  0.73  24
APT33          0.20  0.25  0.22  4
APT34          0.90  0.76  0.83  25
APT37          0.75  0.60  0.67  5
BITTER         0.57  0.50  0.53  8
Cobalt         1.00  0.33  0.50  6
Confucius      0.90  0.82  0.86  11
DarkHotel      0.15  0.67  0.25  3
FIN6           0.60  0.75  0.67  4
FIN7           0.57  0.50  0.53  16
Kimsuky        0.29  0.20  0.24  10
Lazarus        0.77  0.90  0.83  106
MuddyWater     0.88  0.84  0.86  45
ProjectSauron  0.80  0.80  0.80  10
Shammon        1.00  0.75  0.86  8
SideWinder     0.67  0.50  0.57  8
Sofacy         0.84  0.90  0.87  41
StrongPity     1.00  0.85  0.92  46
TeamTNT        1.00  0.14  0.25  7
PROMETHIUM     0.83  1.00  0.91  5
TA505          0.79  0.82  0.81  33
Accuracy       -     -     0.78  443
Macro avg      0.73  0.66  0.66  443
Micro avg      0.78  0.78  0.78  443
Weighted avg   0.80  0.78  0.78  443
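The summary rows of Table 5 follow from the per-organization rows: the macro average is the unweighted mean over the 22 organizations, and the weighted average weights each organization by its number of reports. The sketch below recomputes both from the rounded table values; the variable and function names are chosen for this example.

```python
# (organization, precision, recall, F1, number of reports), copied from Table 5.
ROWS = [
    ("APT29", 0.70, 0.89, 0.78, 18), ("APT32", 0.80, 0.67, 0.73, 24),
    ("APT33", 0.20, 0.25, 0.22, 4), ("APT34", 0.90, 0.76, 0.83, 25),
    ("APT37", 0.75, 0.60, 0.67, 5), ("BITTER", 0.57, 0.50, 0.53, 8),
    ("Cobalt", 1.00, 0.33, 0.50, 6), ("Confucius", 0.90, 0.82, 0.86, 11),
    ("DarkHotel", 0.15, 0.67, 0.25, 3), ("FIN6", 0.60, 0.75, 0.67, 4),
    ("FIN7", 0.57, 0.50, 0.53, 16), ("Kimsuky", 0.29, 0.20, 0.24, 10),
    ("Lazarus", 0.77, 0.90, 0.83, 106), ("MuddyWater", 0.88, 0.84, 0.86, 45),
    ("ProjectSauron", 0.80, 0.80, 0.80, 10), ("Shammon", 1.00, 0.75, 0.86, 8),
    ("SideWinder", 0.67, 0.50, 0.57, 8), ("Sofacy", 0.84, 0.90, 0.87, 41),
    ("StrongPity", 1.00, 0.85, 0.92, 46), ("TeamTNT", 1.00, 0.14, 0.25, 7),
    ("PROMETHIUM", 0.83, 1.00, 0.91, 5), ("TA505", 0.79, 0.82, 0.81, 33),
]

P, R, F1, SUPPORT = 1, 2, 3, 4  # column indices within each row


def macro_avg(rows, col):
    """Unweighted mean over organizations: every class counts equally."""
    return sum(row[col] for row in rows) / len(rows)


def weighted_avg(rows, col):
    """Mean weighted by the number of reports per organization."""
    total = sum(row[SUPPORT] for row in rows)
    return sum(row[col] * row[SUPPORT] for row in rows) / total


total_reports = sum(row[SUPPORT] for row in ROWS)
macro = [round(macro_avg(ROWS, c), 2) for c in (P, R, F1)]
weighted = [round(weighted_avg(ROWS, c), 2) for c in (P, R, F1)]
print(total_reports, macro, weighted)
```

The rounded results agree with the Macro avg and Weighted avg rows of the table. The gap between macro recall (0.66) and weighted recall (0.78) reflects the class imbalance: small classes such as TeamTNT and Kimsuky pull the macro figure down, while Lazarus alone contributes 106 of the 443 reports.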
Table 6. Comparison of methods
Method     P      R      F1
GCN        0.744  0.744  0.744
GraphSAGE  0.752  0.748  0.749
GAT        0.780  0.780  0.780