Threat intelligence attribution method based on graph attention mechanism

WANG Ting; YAN Hanbing; LANG Bo

doi:10.13700/j.bh.1001-5965.2022.0590


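WANG Ting^{1,2}, YAN Hanbing^{2}, LANG Bo^{1}

1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
2. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China

Funds: National Key Research and Development Program of China (2019QY1400)

Corresponding author: E-mail: yhb@cert.org.cn

CLC number: TP391

Publication history
- Received: 2022-07-05
- Accepted: 2022-09-09
- Published online: 2022-12-28
- Issue published: 2024-07-18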

Abstract:
Threat intelligence correlation analysis has become an effective means of tracing the source of cyber attacks. Threat intelligence analysis reports on different advanced persistent threat (APT) organizations were crawled from public threat intelligence sources, and a report classification method based on the graph attention mechanism is proposed. The goal is to determine whether a newly published threat intelligence analysis report belongs to a known attack organization, thereby supporting further expert analysis. A threat intelligence knowledge graph is designed, tactical and technical intelligence is extracted, and the attributes of malicious samples, IPs, and domain names are mined to construct a complex network; a graph attention neural network then classifies the threat intelligence report nodes. Evaluation shows that the proposed method reaches an accuracy of 78% under an imbalanced class distribution, so the organization behind a threat intelligence report can be identified effectively.
Keywords:
- threat intelligence /
- advanced persistent threat organization /
- knowledge graph /
- graph attention mechanism /
- attack source tracing

Figure 1. Workflow of the proposed method

Figure 2. Threat intelligence knowledge graph structure

Figure 3. Application example of the TRAM tool

Figure 4. Heterogeneous threat intelligence networks mapped to homogeneous networks

Figure 5. Schematic diagram of message aggregation of graph attention mechanism

Figure 6. Homogeneous graph attention mechanism model

Figure 7. Example of threat intelligence report related to APT32 released by FireEye

Figure 8. Statistics of threat intelligence entities

Figure 9. Statistics of threat intelligence relations

Figure 10. Statistics of APT organization analysis reports

Figure 11. Accuracy of multi-head attention mechanisms

Table 1. Examples of IOC regular expressions

Entity type	Regular expression
MD5	[a-f0-9]{32}|[A-F0-9]{32}
SHA1	[a-f0-9]{40}|[A-F0-9]{40}
SHA256	[a-f0-9]{64}|[A-F0-9]{64}
CVE	CVE-[0-9]{4}-[0-9]{4,6}
IP	((25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))\.){3}(25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))
Domain	([a-zA-Z0-9][-a-zA-Z0-9]{0,62}\.+?)([a-zA-Z][-a-zA-Z]{0,62})
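As a rough illustration of how the patterns in Table 1 might be applied to report text, the following Python sketch extracts hashes, CVE identifiers, and IP addresses. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `extract_iocs`, the `\b` word-boundary anchors (added so that a longer hash is not also reported as a shorter one), and the simplified but equivalent grouping of the IP pattern are all choices made here. The Domain pattern is left out because its exact grouping is uncertain in the source.

```python
import re

# Patterns adapted from Table 1. The \b anchors and non-capturing groups are
# assumptions added for this sketch; they are not part of the published table.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)"
IOC_PATTERNS = {
    "MD5": r"\b(?:[a-f0-9]{32}|[A-F0-9]{32})\b",
    "SHA1": r"\b(?:[a-f0-9]{40}|[A-F0-9]{40})\b",
    "SHA256": r"\b(?:[a-f0-9]{64}|[A-F0-9]{64})\b",
    "CVE": r"\bCVE-[0-9]{4}-[0-9]{4,6}\b",
    "IP": r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b",
}


def extract_iocs(text):
    """Return {entity_type: sorted unique matches} for one report text."""
    found = {}
    for name, pattern in IOC_PATTERNS.items():
        matches = sorted(set(re.findall(pattern, text)))
        if matches:
            found[name] = matches
    return found


# Hypothetical sentence, written only to exercise the patterns.
sample = (
    "Dropper d41d8cd98f00b204e9800998ecf8427e contacted 192.168.1.10 "
    "exploiting CVE-2017-11882."
)
print(extract_iocs(sample))
```

Without the word boundaries, the MD5 pattern would also match the first 32 characters of every SHA1 or SHA256 value, so some form of length disambiguation is needed whenever the three hash patterns are used together.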

Table 2. Aliases of selected APT organizations

APT organization	Aliases
Sofacy	APT28, PawnStorm, FancyBear, Sednit, SNAKEMACKEREL, TsarTeam, TG-4127, Group-4127, STRONTIUM, TAG_0700, Swallowtail, IRONTWILIGHT, Group74, SIG40, GrizzlySteppe, apt_sofacy
BITTER	T-APT-17, APT-C-08, 蔓灵花
APT32	海莲花, OceanLotusGroup, OceanLotus, CobaltKitty, APT-C-00, SeaLotus, APT-32, OceanBuffalo, PONDLOACH, TINWOODLAWN
Confucius	孔夫子
SideWinder	响尾蛇
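An alias table such as Table 2 is typically used to map the many names a vendor may give a group onto a single label before classification. The Python sketch below shows one way this could be done; it is an assumption-laden illustration rather than the paper's procedure. The names `ALIASES`, `canonical_group`, and the normalization rule (case-folding and dropping non-alphanumeric characters) are hypothetical, and only a subset of the aliases from Table 2 is included.

```python
# Subset of Table 2; the canonical name is the dictionary key.
ALIASES = {
    "Sofacy": ["APT28", "PawnStorm", "FancyBear", "Sednit", "STRONTIUM"],
    "BITTER": ["T-APT-17", "APT-C-08", "蔓灵花"],
    "APT32": ["海莲花", "OceanLotus", "APT-C-00", "SeaLotus", "APT-32"],
    "Confucius": ["孔夫子"],
    "SideWinder": ["响尾蛇"],
}


def _key(name):
    """Normalize a name: case-fold and keep only letters and digits."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


# Reverse index from normalized alias to canonical organization name.
LOOKUP = {
    _key(alias): canonical
    for canonical, aliases in ALIASES.items()
    for alias in [canonical, *aliases]
}


def canonical_group(name):
    """Return the canonical organization for a name, or None if unknown."""
    return LOOKUP.get(_key(name))


print(canonical_group("Fancy Bear"))  # spacing and case differences are ignored
print(canonical_group("海莲花"))
```

Normalizing before lookup lets spelling variants such as "Fancy Bear", "FancyBear", and "fancybear" resolve to the same label without listing each variant explicitly.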

Table 3. Threat intelligence sources used in this paper

Threat intelligence source	Vendors
Chinese security vendors	NSFOCUS, QiAnXin, 360, ThreatBook, Antiy
International security vendors	MITRE ATT&CK, CheckPoint, CrowdStrike, Microsoft, Trend Micro, Symantec, FireEye, Kaspersky, WeLiveSecurity, Malwarebytes, Mandiant, AhnLab, VirusTotal
GitHub open-source project	APTnotes

Table 4. Hyperparameter settings of the model

Parameter	Value
seed	72
weight_decay	5×10⁻⁴
nb_head	32
α	0.2
l_r	0.005
hidden	8
dropout	0.3
patience	200
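To make the role of α in Table 4 more concrete, the following pure-Python sketch computes attention weights and a neighborhood aggregation for a single attention head, following the standard graph attention network formulation of Veličković et al. It assumes that α is the negative slope of the LeakyReLU applied to the attention scores, as in that formulation; the paper's own layer may differ. The toy graph, the feature vectors, and the attention parameters `a_src` and `a_dst` are invented for illustration, and the learned linear transform, output nonlinearity, and multi-head concatenation are omitted.

```python
import math

ALPHA = 0.2  # α from Table 4, assumed to be the LeakyReLU negative slope


def leaky_relu(x, slope=ALPHA):
    return x if x > 0 else slope * x


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attention_weights(h, neighbors, a_src, a_dst):
    """Softmax-normalized attention of each node over its neighbors.

    h: {node: feature vector}, assumed already linearly transformed.
    neighbors: {node: list of neighbor nodes, including the node itself}.
    a_src, a_dst: the two halves of the attention vector a, so that
        a^T [h_i || h_j] = a_src . h_i + a_dst . h_j
    """
    weights = {}
    for i, nbrs in neighbors.items():
        scores = [leaky_relu(dot(a_src, h[i]) + dot(a_dst, h[j])) for j in nbrs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return weights


def aggregate(h, weights):
    """Attention-weighted sum of neighbor features for each node."""
    out = {}
    for i, w_i in weights.items():
        dim = len(h[i])
        out[i] = [sum(w * h[j][k] for j, w in w_i.items()) for k in range(dim)]
    return out


# Toy graph: three report nodes with made-up 2-d features.
h = {"r1": [1.0, 0.0], "r2": [0.0, 1.0], "r3": [0.0, 1.0]}
neighbors = {"r1": ["r1", "r2", "r3"]}
w = attention_weights(h, neighbors, a_src=[0.5, 0.5], a_dst=[1.0, -1.0])
h_new = aggregate(h, w)
print(w["r1"], h_new["r1"])
```

With nb_head = 32, a multi-head layer would repeat this computation with 32 independent parameter sets and combine the results, typically by concatenation in hidden layers.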

Table 5. Performance evaluation metrics

Organization	P	R	F₁	Number of reports
APT29	0.70	0.89	0.78	18
APT32	0.80	0.67	0.73	24
APT33	0.20	0.25	0.22	4
APT34	0.90	0.76	0.83	25
APT37	0.75	0.60	0.67	5
BITTER	0.57	0.50	0.53	8
Cobalt	1.00	0.33	0.50	6
Confucius	0.90	0.82	0.86	11
DarkHotel	0.15	0.67	0.25	3
FIN6	0.60	0.75	0.67	4
FIN7	0.57	0.50	0.53	16
Kimsuky	0.29	0.20	0.24	10
Lazarus	0.77	0.90	0.83	106
MuddyWater	0.88	0.84	0.86	45
ProjectSauron	0.80	0.80	0.80	10
Shammon	1.00	0.75	0.86	8
SideWinder	0.67	0.50	0.57	8
Sofacy	0.84	0.90	0.87	41
StrongPity	1.00	0.85	0.92	46
TeamTNT	1.00	0.14	0.25	7
PROMETHIUM	0.83	1.00	0.91	5
TA505	0.79	0.82	0.81	33
Accuracy			0.78	443
Macro avg	0.73	0.66	0.66	443
Micro avg	0.78	0.78	0.78	443
Weighted avg	0.80	0.78	0.78	443
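The summary rows of Table 5 can be related to the per-organization rows with a short calculation. The Python sketch below recomputes the macro average (unweighted mean over organizations) and the weighted average (mean weighted by the number of reports) from the rounded values in the table; the function name `averages` is a choice made here, and because the inputs are already rounded to two decimals, the results are only expected to agree with the published rows after rounding.

```python
# (organization, precision, recall, F1, number of reports), copied from Table 5
ROWS = [
    ("APT29", 0.70, 0.89, 0.78, 18), ("APT32", 0.80, 0.67, 0.73, 24),
    ("APT33", 0.20, 0.25, 0.22, 4), ("APT34", 0.90, 0.76, 0.83, 25),
    ("APT37", 0.75, 0.60, 0.67, 5), ("BITTER", 0.57, 0.50, 0.53, 8),
    ("Cobalt", 1.00, 0.33, 0.50, 6), ("Confucius", 0.90, 0.82, 0.86, 11),
    ("DarkHotel", 0.15, 0.67, 0.25, 3), ("FIN6", 0.60, 0.75, 0.67, 4),
    ("FIN7", 0.57, 0.50, 0.53, 16), ("Kimsuky", 0.29, 0.20, 0.24, 10),
    ("Lazarus", 0.77, 0.90, 0.83, 106), ("MuddyWater", 0.88, 0.84, 0.86, 45),
    ("ProjectSauron", 0.80, 0.80, 0.80, 10), ("Shammon", 1.00, 0.75, 0.86, 8),
    ("SideWinder", 0.67, 0.50, 0.57, 8), ("Sofacy", 0.84, 0.90, 0.87, 41),
    ("StrongPity", 1.00, 0.85, 0.92, 46), ("TeamTNT", 1.00, 0.14, 0.25, 7),
    ("PROMETHIUM", 0.83, 1.00, 0.91, 5), ("TA505", 0.79, 0.82, 0.81, 33),
]


def averages(rows):
    """Return (macro, weighted, total) where each average is (P, R, F1)."""
    total = sum(r[4] for r in rows)
    macro = tuple(sum(r[k] for r in rows) / len(rows) for k in (1, 2, 3))
    weighted = tuple(sum(r[k] * r[4] for r in rows) / total for k in (1, 2, 3))
    return macro, weighted, total


macro, weighted, total = averages(ROWS)
print("reports:", total)
print("macro avg:", [round(x, 2) for x in macro])
print("weighted avg:", [round(x, 2) for x in weighted])
```

The gap between the macro and weighted rows reflects the class imbalance: large classes such as Lazarus (106 reports) score well and dominate the weighted average, while small classes such as APT33 and DarkHotel pull the macro average down.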

Table 6. Comparison of methods

Method	P	R	F₁
GCN	0.744	0.744	0.744
GraphSAGE	0.752	0.748	0.749
GAT	0.780	0.780	0.780
