Prediction of aviation safety event risk level based on ensemble cost-sensitive deep neural network
-
Abstract: Predicting the risk level of aviation safety events is an important means of proactive risk management. Because large-scale aviation safety event data are high-dimensional, complex, and class-imbalanced, this paper proposes a risk level prediction method based on an ensemble cost-sensitive deep neural network (ECSDNN). First, aviation safety event data are represented by embedding-encoding the categorical attributes and concatenating the result with the numerical attributes. Second, a cost-sensitive matrix and a cost-sensitive loss function are designed by jointly considering the misclassification ratio and a fixed cost, and a base classifier built on a cost-sensitive deep neural network (CSDNN) is constructed. Finally, several base classifiers with different parameters and differing performance are combined by hard voting to form the risk level prediction model. Experiments on the Aviation Safety Reporting System (ASRS) dataset show that the prediction accuracy of ECSDNN is 4.51 percentage points higher than that of the best benchmark algorithm and 3.17 percentage points higher than that of the best single CSDNN base classifier, which verifies the effectiveness of the proposed method.
-
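Table 1. ASRS report example

| Attribute group | Attribute | Value |
| --- | --- | --- |
| Time | Date | 202007 |
| | Local time of day | 1801-2400 |
| Place | Locale reference | ZMP.ARTCC |
| | State reference | NM |
| | AGL altitude | 5181.6 m |
| Environment | Flight conditions | VMC |
| | Weather elements/visibility | 10 |
| | Light | Daylight |
| Aircraft | ATC/advisory | Center ZMP |
| | Aircraft operator | Air Carrier |
| | Make/model name | Commercial Fixed Wing |
| | Crew size | 2 |
| | Operating under FAR part | Part 121 |
| | Flight plan | IFR |
| | Mission | Passenger |
| | Flight phase | Climb |
| | Airspace | Class E ZMP |
| Component | Aircraft component | Pressurization Control System |
| | Reference | X |
| | Problem | Improperly Operated |
| Person | Location of person | Facility ZMP.ARTCC |
| | Reporter organization | Government |
| | Function | Enroute; Trainee |
| | Qualification | Air Traffic Control Developmental |
| | Human factors | Situational Awareness |
| Events | Anomaly | ATC Issue All Types; Airspace Violation All Types; Deviation/Discrepancy-Procedural Published Material/Policy |
| | Detector | Person Air Traffic Control |
| | When detected | In-flight |
| | Result | General None Reported/Taken |
| Assessments | Contributing factors/situations | Environment-Non Weather Related; Human Factors; Airspace Structure |
| | Primary problem | Human Factors |

Note: Data source: https://asrs.arc.nasa.gov/search/database.html.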
Table 2. Event risk level labeling based on event results
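| Risk level | Event outcome |
| --- | --- |
| Low risk (0) | General: police/security involved; ATC: provided assistance; Aircraft: equipment problem dissipated; Flight crew: returned to gate / Flight crew (FLC): complied with automation advisory; General: none reported/taken |
| Medium-low risk (1) | General: maintenance action / General: flight cancelled or delayed; General: release refused, aircraft not accepted; Flight crew: requested ATC assistance or clarification; Flight crew: overrode automation / Aircraft: automation overrode flight crew / Flight crew (FLC): overrode automation; Flight crew: exited penetrated airspace / landed safely / returned to safe area / returned to departure airport |
| Medium risk (2) | General: work refused; Flight crew: became reoriented; Flight crew: diverted / executed go-around or missed approach; ATC: issued new clearance; Flight crew: overcame equipment problem / rejected takeoff / took evasive action |
| Medium-high risk (3) | General: evacuated; Flight crew: regained aircraft control; ATC: issued advisory; Flight crew: landed in emergency condition; ATC: issued alert |
| High risk (4) | General: declared emergency; General: physical injury/incapacitation; Flight crew: inflight shutdown; ATC: separated traffic; Aircraft: damaged |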
Table 3. Risk level definition
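| Event outcome | Risk level |
| --- | --- |
| Aircraft: damaged | High risk (4) |
| General: evacuated | Medium-high risk (3) |
| General: work refused | Medium risk (2) |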
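The labeling step implied by Tables 2 and 3 can be sketched as a lookup from outcome to risk level. This is a minimal illustration, not the authors' procedure: the dictionary holds only a few rows of Table 2, the English outcome strings are this sketch's own wording, and the rule that an event with several outcomes takes the most severe level is an assumption that the tables do not state.

```python
# Outcome -> risk level for a few rows of Table 2.
# The English outcome strings are this sketch's own renderings.
OUTCOME_RISK = {
    "General: none reported/taken": 0,
    "General: maintenance action": 1,
    "General: work refused": 2,
    "General: evacuated": 3,
    "Aircraft: damaged": 4,
}


def label_event(outcomes):
    """Return one risk level for an event given its list of outcomes.

    Assumption (not stated in the tables): the most severe outcome
    determines the label.
    """
    return max(OUTCOME_RISK[o] for o in outcomes)


# The three outcomes listed in Table 3, treated as one hypothetical event.
example_outcomes = ["Aircraft: damaged", "General: evacuated", "General: work refused"]
risk = label_event(example_outcomes)
print(risk)
```

Under that assumption, the three outcomes in Table 3 would jointly yield the high-risk label.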
Table 4. Attributes for aviation safety event prediction
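| Attribute group | Attributes |
| --- | --- |
| Place | Local time of day, state reference, relative angle, relative distance, AGL altitude, MSL altitude |
| Aircraft 1 | ATC/advisory, aircraft operator, make/model name, crew size, operating under FAR part, flight plan, mission, flight phase, route in use, airspace, navigation in use |
| Aircraft 2 | ATC/advisory, aircraft operator, make/model name |
| Person 1 | Location of person, location in aircraft, reporter organization, function, qualification, human factors, communication breakdown |
| Person 2 | Location of person, location in aircraft, reporter organization, function, qualification, human factors, communication breakdown |
| Environment | Aircraft state, weather elements/visibility, work environment factor, light, ceiling |
| Component | Aircraft component, reference, manufacturer, problem |
| Time | Local time of day |
| Events | Anomaly, detector, were passengers involved, when detected |
| Assessments | Contributing factors/situations, primary problem |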
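The feature representation described in the abstract (embedding the categorical attributes and concatenating them with the numerical ones) can be sketched as follows. This is a simplified, dependency-free illustration under stated assumptions: the embedding width, the attribute keys, the small vocabularies, and the altitude scaling are all hypothetical, and the embedding vectors here are random and fixed, whereas a real network would learn them during training.

```python
import random

random.seed(0)
EMBED_DIM = 4  # hypothetical embedding width


def build_embedding(vocabulary, dim=EMBED_DIM):
    """Create a randomly initialised lookup table: category -> vector."""
    return {v: [random.uniform(-0.1, 0.1) for _ in range(dim)] for v in vocabulary}


# Two categorical attributes from Table 4 with toy vocabularies (assumed).
embeddings = {
    "flight_phase": build_embedding(["Climb", "Cruise", "Descent"]),
    "flight_plan": build_embedding(["IFR", "VFR"]),
}


def encode(record):
    """Embed each categorical value, then append the numerical value."""
    features = []
    for name, table in embeddings.items():
        features.extend(table[record[name]])
    # Crude scaling of the numerical attribute; the divisor is an assumption.
    features.append(record["altitude_agl_m"] / 10000.0)
    return features


x = encode({"flight_phase": "Climb", "flight_plan": "IFR", "altitude_agl_m": 5181.6})
print(len(x))  # two embeddings of width EMBED_DIM plus one numerical feature
```

The resulting vector is what a dense network would take as input. Its length grows with the number of categorical attributes times the embedding width, not with the vocabulary size, which is the usual reason for preferring embeddings to one-hot codes on high-cardinality fields.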
Table 5. Base classifier parameter settings
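| Model | Loss function | epoch | $\eta$ | $\beta_1$ | $\beta_2$ | $\alpha$ | $\delta$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CSDNN1 | LS | 200 | 10 | 0 | 1 | 2 | 10 |
| CSDNN2 | FL | 200 | 10 | 0 | 1 | 2 | 10 |
| CSDNN3 | FL | 200 | 10 | 0 | 1 | 1 | 1 |
| CSDNN4 | CE | 200 | 30 | 0.5 | 0.5 | 1 | 1 |
| CSDNN5 | CE | 200 | 20 | 0.5 | 0.5 | 1 | 1 |
| CSDNN6 | CE | 200 | 10 | 0.5 | 0.5 | 1 | 1 |
| CSDNN7 | CE | 100 | 10 | 0 | 1 | 1 | 1 |
| CSDNN8 | CE | 100 | 10 | 0 | 1 | 1 | 1 |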
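Assuming the abbreviations in Table 5 stand for label smoothing (LS), focal loss (FL), and cross entropy (CE), the three losses can be compared on a single sample as sketched below. The parameters `eps` and `gamma` are hypothetical names chosen for this illustration; they are not meant to correspond to the symbols in Table 5, whose definitions are not given here.

```python
import math


def cross_entropy(probs, target):
    """Standard cross entropy for one sample: -log p_target."""
    return -math.log(probs[target])


def label_smoothing_loss(probs, target, eps=0.1):
    """Cross entropy against a smoothed target distribution."""
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        # True class gets 1 - eps + eps/k, every other class gets eps/k.
        q = (1.0 - eps) + eps / k if i == target else eps / k
        loss -= q * math.log(p)
    return loss


def focal_loss(probs, target, gamma=2.0):
    """Focal loss: scales cross entropy by (1 - p_target) ** gamma."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


probs = [0.7, 0.1, 0.1, 0.05, 0.05]  # predicted distribution over 5 risk levels
ce = cross_entropy(probs, 0)
ls = label_smoothing_loss(probs, 0)
fl = focal_loss(probs, 0)
print(ce, ls, fl)
```

For this fairly confident correct prediction, focal loss is smaller than cross entropy and the label-smoothed loss is larger. Base classifiers trained with different losses therefore weight easy and hard samples differently, which is one plausible source of diversity among them.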
Table 6. Comparison of experimental results with different algorithms
| Model | P_Macro | R_Macro | F1_Macro | P_Weighted | R_Weighted | F1_Weighted | ACC/% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RF | 0.8083 | 0.8007 | 0.8023 | 0.8161 | 0.8029 | 0.8074 | 80.29 |
| KNN | 0.5827 | 0.5710 | 0.5717 | 0.6048 | 0.5770 | 0.5863 | 57.70 |
| NBC | 0.5793 | 0.5769 | 0.5745 | 0.5903 | 0.5755 | 0.5791 | 57.55 |
| DTC | 0.8033 | 0.7935 | 0.7941 | 0.8187 | 0.7971 | 0.8042 | 79.71 |
| LRC | 0.6814 | 0.6761 | 0.6768 | 0.6881 | 0.6773 | 0.6807 | 67.73 |
| RNN | 0.7894 | 0.7952 | 0.7897 | 0.7874 | 0.7902 | 0.7861 | 79.02 |
| ECSDNN | 0.8478 | 0.8517 | 0.8493 | 0.8458 | 0.8480 | 0.8464 | 84.80 |
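The accuracy gain over the benchmark algorithms reported in the abstract can be rechecked from the ACC column of Table 6. The short script below only restates that arithmetic; the variable names are this sketch's own.

```python
# ACC/% column of Table 6 for the benchmark algorithms.
baseline_acc = {
    "RF": 80.29,
    "KNN": 57.70,
    "NBC": 57.55,
    "DTC": 79.71,
    "LRC": 67.73,
    "RNN": 79.02,
}
ecsdnn_acc = 84.80

# Strongest benchmark and the gap to ECSDNN, in percentage points.
best_name = max(baseline_acc, key=baseline_acc.get)
gain = round(ecsdnn_acc - baseline_acc[best_name], 2)
print(best_name, gain)
```

The strongest benchmark is RF, and the difference is an absolute gap in percentage points, not a relative improvement.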
Table 7. Confusion matrix results and accuracy of different risk levels
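| Risk level | Low risk | Medium-low risk | Medium risk | Medium-high risk | High risk | ACC/% |
| --- | --- | --- | --- | --- | --- | --- |
| Low risk | 1541 | 31 | 160 | 15 | 31 | 86.67 |
| Medium-low risk | 16 | 1619 | 53 | 9 | 36 | 93.42 |
| Medium risk | 209 | 120 | 1345 | 140 | 135 | 69.01 |
| Medium-high risk | 26 | 14 | 128 | 1549 | 39 | 88.21 |
| High risk | 40 | 32 | 85 | 43 | 1544 | 88.53 |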
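The per-level accuracies in Table 7 and the ECSDNN row of Table 6 can be recomputed from the confusion matrix, as sketched below. The row-wise ratios reproduce the ACC column, which suggests that rows are true levels and columns are predicted levels; that orientation is an inference from the numbers, not something the table states.

```python
# Confusion matrix from Table 7.
# Assumed orientation: rows = true level, columns = predicted level (0..4).
cm = [
    [1541, 31, 160, 15, 31],
    [16, 1619, 53, 9, 36],
    [209, 120, 1345, 140, 135],
    [26, 14, 128, 1549, 39],
    [40, 32, 85, 43, 1544],
]
n = len(cm)

row_totals = [sum(row) for row in cm]
col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]

# Per-class recall (diagonal / row total) and precision (diagonal / column total).
recall = [cm[i][i] / row_totals[i] for i in range(n)]
precision = [cm[j][j] / col_totals[j] for j in range(n)]

per_class_acc = [round(100 * r, 2) for r in recall]
overall_acc = round(100 * sum(cm[i][i] for i in range(n)) / sum(row_totals), 2)
macro_recall = sum(recall) / n
macro_precision = sum(precision) / n

print(per_class_acc, overall_acc)
print(round(macro_recall, 4), round(macro_precision, 4))
```

The medium-risk level has the lowest per-level accuracy, and most of its errors fall into the low-risk column.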
Table 8. CSDNN base classifier ablation experiment result
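| Model | P_Macro | R_Macro | F1_Macro | P_Weighted | R_Weighted | F1_Weighted | ACC/% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DNN | 0.7959 | 0.8048 | 0.7976 | 0.7939 | 0.7991 | 0.7935 | 79.91 |
| CSDNN | 0.8078 | 0.8155 | 0.8087 | 0.8062 | 0.8101 | 0.8050 | 81.01 |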
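The difference between the DNN and CSDNN rows of Table 8 comes from the cost-sensitive loss. The sketch below shows one way a cost matrix could combine a misclassification ratio with a fixed cost and then scale cross entropy. It is an illustration only: the combination rule, the weights `w_ratio` and `w_fixed`, the `1 + expected cost` scaling, and the toy 3-class data are all assumptions, not the authors' formulation.

```python
import math


def build_cost_matrix(confusion, fixed_cost, w_ratio=0.5, w_fixed=0.5):
    """C[i][j]: cost of predicting class j when the true class is i.

    Off-diagonal cost = w_ratio * (share of class i misclassified as j)
                      + w_fixed * fixed_cost[i][j]; the diagonal stays 0.
    """
    n = len(confusion)
    costs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row_total = sum(confusion[i])
        for j in range(n):
            if i != j:
                ratio = confusion[i][j] / row_total
                costs[i][j] = w_ratio * ratio + w_fixed * fixed_cost[i][j]
    return costs


def cost_sensitive_ce(probs, target, costs):
    """Cross entropy scaled by (1 + expected misclassification cost)."""
    expected_cost = sum(costs[target][j] * p for j, p in enumerate(probs))
    return -(1.0 + expected_cost) * math.log(probs[target])


# Toy 3-class example (hypothetical numbers).
confusion = [[8, 1, 1], [2, 7, 1], [0, 3, 7]]
fixed_cost = [[abs(i - j) for j in range(3)] for i in range(3)]
costs = build_cost_matrix(confusion, fixed_cost)

probs = [0.6, 0.3, 0.1]
plain = -math.log(probs[0])
loss = cost_sensitive_ce(probs, 0, costs)
print(plain, loss)
```

With an all-zero cost matrix the function reduces to plain cross entropy. With positive costs, probability mass placed on expensive wrong classes raises the loss.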
Table 9. Ablation experiment results
| Model | P_Macro | R_Macro | F1_Macro | P_Weighted | R_Weighted | F1_Weighted | ACC/% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CSDNN1 | 0.8144 | 0.8211 | 0.8165 | 0.8120 | 0.8163 | 0.8128 | 81.63 |
| CSDNN2 | 0.8099 | 0.8163 | 0.8117 | 0.8080 | 0.8117 | 0.8084 | 81.17 |
| CSDNN3 | 0.8075 | 0.8142 | 0.8091 | 0.8056 | 0.8094 | 0.8057 | 80.94 |
| CSDNN4 | 0.8075 | 0.8159 | 0.8096 | 0.8054 | 0.8105 | 0.8058 | 81.05 |
| CSDNN5 | 0.8102 | 0.8164 | 0.8122 | 0.8078 | 0.8117 | 0.8087 | 81.17 |
| CSDNN6 | 0.8140 | 0.8201 | 0.8159 | 0.8119 | 0.8156 | 0.8126 | 81.56 |
| CSDNN7 | 0.8106 | 0.8177 | 0.8128 | 0.8083 | 0.8128 | 0.8092 | 81.28 |
| CSDNN8 | 0.8085 | 0.8158 | 0.8106 | 0.8064 | 0.8108 | 0.8070 | 81.08 |
| ECSDNN | 0.8478 | 0.8517 | 0.8493 | 0.8458 | 0.8480 | 0.8464 | 84.80 |
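Table 9 compares each base classifier with the hard-voting ensemble. A minimal sketch of hard voting follows. The tie-breaking rule (prefer the smallest label) and the toy predictions are assumptions made for this illustration; how the authors resolve ties is not stated here.

```python
from collections import Counter


def hard_vote(votes):
    """Return the majority label among base-classifier votes for one sample.

    Assumption: ties are broken in favour of the smallest label.
    """
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def ensemble_predict(per_model_predictions):
    """per_model_predictions[m][i] is the label from model m for sample i."""
    return [hard_vote(votes) for votes in zip(*per_model_predictions)]


# Toy predictions from three hypothetical base classifiers on four samples.
base_outputs = [
    [0, 2, 4, 1],
    [0, 2, 3, 1],
    [1, 2, 4, 3],
]
final = ensemble_predict(base_outputs)
print(final)
```

Hard voting helps only when the base classifiers make different errors. That is consistent with the ensemble in Table 9 exceeding every individual CSDNN by a wider margin than the spread among the base classifiers themselves.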
Table 10. Attribute rationality test results
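| Number of attributes | P_Macro | R_Macro | F1_Macro | P_Weighted | R_Weighted | F1_Weighted | ACC/% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 40 | 0.7860 | 0.7950 | 0.7878 | 0.7840 | 0.7893 | 0.7838 | 78.93 |
| 50 | 0.8172 | 0.8232 | 0.8189 | 0.8151 | 0.8187 | 0.8158 | 81.87 |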