Abstract: Liquid oxygen and liquid hydrogen engines are a key subsystem in the aerospace field. To support the intelligent transformation of their development process, a knowledge graph is constructed for the liquid oxygen and liquid hydrogen engine domain, preserving domain knowledge while improving the training of research and production personnel. Based on the characteristics of this domain, three aspects are studied: domain corpus annotation, domain knowledge entity recognition, and entity relation recognition. On this basis, a domain knowledge graph is constructed, and its business application modes are organized from three perspectives: domain knowledge search, knowledge recommendation, and exploratory analysis. The work yields a knowledge system for the liquid oxygen and liquid hydrogen engine domain, together with the construction method and application modes. The results provide a reference for the intelligent transformation of the aerospace field.
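Table 1. Main extended attributes

| Concept type | Attribute purpose | Basic composition |
| --- | --- | --- |
| Document knowledge | Describes how the document is obtained and what it contains | Document title, source, level, business direction, storage path, document content |
| Propellant | Describes the main components of the propellant | Composition, mixture ratio, storage conditions, storage duration, combustion efficiency, energy efficiency |
| Structure | Describes the main structural technologies of the engine | Ignition method, combustion chamber, nozzle, cooling system, acoustic system, exhaust system |
| APP | Describes the basic information and usage information of the APP | Application name, software platform, supplier, operator, introduction, content language |
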
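The extended attributes in Table 1 can be read as a small schema in which each concept type carries a fixed list of attribute names. The sketch below encodes that reading in Python. The dictionary layout, the English attribute keys, and the `missing_attributes` helper are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical schema derived from Table 1: concept type -> extended attribute names.
# The key names are English renderings chosen for this sketch.
EXTENDED_ATTRIBUTES = {
    "document_knowledge": [
        "title", "source", "level",
        "business_direction", "storage_path", "content",
    ],
    "propellant": [
        "composition", "mixture_ratio", "storage_conditions",
        "storage_duration", "combustion_efficiency", "energy_efficiency",
    ],
    "structure": [
        "ignition_method", "combustion_chamber", "nozzle",
        "cooling_system", "acoustic_system", "exhaust_system",
    ],
    "app": [
        "application_name", "software_platform", "supplier",
        "operator", "introduction", "content_language",
    ],
}


def missing_attributes(concept_type, entity):
    """Return the schema attributes that the entity dict does not provide."""
    expected = EXTENDED_ATTRIBUTES[concept_type]
    return [name for name in expected if name not in entity]


# Hypothetical, partially filled entity used only to exercise the helper.
example_entity = {"composition": "liquid oxygen / liquid hydrogen"}
print(missing_attributes("propellant", example_entity))
```

A check of this kind could flag incomplete entities before they are loaded into the graph; whether the original system validates attributes this way is not stated in the source.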
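Table 2. Experimental data statistics

| Dataset | Number of entities | Entity pairs | Entity relations |
| --- | --- | --- | --- |
| Training corpus | 32 384 | 834 629 | 60 947 |
| Development corpus | 9 102 | 90 640 | 5 275 |
| Test corpus | 11 286 | 103 809 | 7 206 |
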
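To see how the corpus in Table 2 is divided, the share of each split can be computed per column. The sketch below uses only the counts from Table 2; the variable and function names are assumptions made for illustration.

```python
# Counts from Table 2: (entities, entity pairs, entity relations) per split.
COUNTS = {
    "train": (32384, 834629, 60947),
    "dev": (9102, 90640, 5275),
    "test": (11286, 103809, 7206),
}
COLUMN_LABELS = ("entities", "entity pairs", "entity relations")


def split_shares(column):
    """Return each split's fraction of the total for the given column index."""
    total = sum(row[column] for row in COUNTS.values())
    return {name: row[column] / total for name, row in COUNTS.items()}


for column, label in enumerate(COLUMN_LABELS):
    shares = split_shares(column)
    formatted = ", ".join(f"{name}: {share:.1%}" for name, share in shares.items())
    print(f"{label}: {formatted}")
```

By these counts the training corpus holds roughly 61% of the entities but about 83% of the entity relations, so the splits are not proportioned identically across columns.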
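Table 3. Knowledge entity recognition model parameter settings

| Transformer layers | Vector dimension | Hidden layer size | Dropout | Learning rate | Batch size | Training epochs |
| --- | --- | --- | --- | --- | --- | --- |
| 12 | 256 | 1024 | 0.5 | $3\times 10^{-5}$ | 32 | 100 |
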
Table 4. Entity relation recognition model parameter settings

| $\theta(d)$ | $\theta(w)$ |
| --- | --- |
| 5 | {3,4} |

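Table 5. Control parameter groups

| Group | Vector dimension | Hidden layer size | Batch size | $\theta(\mathrm{dist})$ | $\theta(\mathrm{wind})$ |
| --- | --- | --- | --- | --- | --- |
| A | 128 | 512 | 16 | 4 | {3,3} |
| B | 128 | 768 | 16 | 4 | {3,4} |
| C | 128 | 1024 | 16 | 5 | {3,5} |
| D | 256 | 512 | 32 | 6 | {4,5} |
| E | 256 | 768 | 32 | 6 | {4,6} |
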
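The control groups in Table 5 vary the same five parameters that Tables 3 and 4 fix for the experimental group. The sketch below records them as configuration objects and lists how each control group differs from the experimental setting. It assumes that $\theta(d)$ and $\theta(w)$ in Table 4 denote the same parameters as $\theta(\mathrm{dist})$ and $\theta(\mathrm{wind})$ in Table 5; the class and field names are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ParamGroup:
    """Parameters that vary across groups (field names are assumptions)."""
    vector_dim: int
    hidden_size: int
    batch_size: int
    theta_dist: int
    theta_wind: tuple


# Experimental group: values from Tables 3 and 4.
EXPERIMENTAL = ParamGroup(256, 1024, 32, 5, (3, 4))

# Control groups: values from Table 5.
CONTROL = {
    "A": ParamGroup(128, 512, 16, 4, (3, 3)),
    "B": ParamGroup(128, 768, 16, 4, (3, 4)),
    "C": ParamGroup(128, 1024, 16, 5, (3, 5)),
    "D": ParamGroup(256, 512, 32, 6, (4, 5)),
    "E": ParamGroup(256, 768, 32, 6, (4, 6)),
}


def differences(group, reference=EXPERIMENTAL):
    """Map each differing field to (group value, reference value)."""
    ref = asdict(reference)
    return {k: (v, ref[k]) for k, v in asdict(group).items() if v != ref[k]}


for name, group in CONTROL.items():
    print(name, differences(group))
```

Keeping the groups as frozen dataclasses makes each configuration hashable and immutable, which is convenient when results are keyed by configuration.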
Table 6. Results of the knowledge entity recognition experiment

| Group | P | R | F1 |
| --- | --- | --- | --- |
| A | 0.852 | 0.805 | 0.828 |
| B | 0.894 | 0.846 | 0.869 |
| C | 0.902 | 0.869 | 0.885 |
| D | 0.879 | 0.852 | 0.865 |
| E | 0.910 | 0.869 | 0.889 |
| Experimental group | 0.908 | 0.882 | 0.895 |

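The F1 column of the knowledge entity recognition results (Table 6) can be checked against the standard definition $F_1 = 2PR/(P+R)$, assuming P and R denote precision and recall. The sketch below recomputes F1 from the reported P and R and compares it with the reported value to three decimal places; the data structure and function name are illustrative.

```python
# (P, R, reported F1) per group, copied from Table 6.
TABLE6 = {
    "A": (0.852, 0.805, 0.828),
    "B": (0.894, 0.846, 0.869),
    "C": (0.902, 0.869, 0.885),
    "D": (0.879, 0.852, 0.865),
    "E": (0.910, 0.869, 0.889),
    "experimental": (0.908, 0.882, 0.895),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for group, (p, r, reported) in TABLE6.items():
    recomputed = f1_score(p, r)
    # Values reported to three decimals agree if they differ by at most 0.0005.
    status = "consistent" if abs(recomputed - reported) <= 0.0005 else "differs"
    print(f"{group}: recomputed F1 = {recomputed:.4f}, reported = {reported} ({status})")
```

Because F1 is a harmonic mean, it rewards balanced precision and recall, which is why the experimental group leads on F1 even though group E reports a slightly higher precision.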
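Table 7. Results of the entity relation recognition experiment

| Group | P | R | F1 | A5 | A10 | A20 |
| --- | --- | --- | --- | --- | --- | --- |
| A | 0.802 | 0.582 | 0.675 | 0.822 | 0.828 | 0.830 |
| B | 0.789 | 0.637 | 0.705 | 0.797 | 0.805 | 0.813 |
| C | 0.754 | 0.783 | 0.768 | 0.762 | 0.785 | 0.800 |
| D | 0.603 | 0.795 | 0.686 | 0.609 | 0.609 | 0.615 |
| E | 0.585 | 0.804 | 0.677 | 0.627 | 0.665 | 0.709 |
| Experimental group | 0.772 | 0.768 | 0.770 | 0.798 | 0.822 | 0.835 |
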
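The entity relation recognition results (Table 7) are easier to compare when the best group is picked out for each metric. The sketch below does this with the values from Table 7. The columns A5, A10, and A20 are treated only as opaque metric names, since their definition is not given in this section, and the helper name is hypothetical.

```python
COLUMNS = ("P", "R", "F1", "A5", "A10", "A20")

# Rows copied from Table 7, in the column order above.
TABLE7 = {
    "A": (0.802, 0.582, 0.675, 0.822, 0.828, 0.830),
    "B": (0.789, 0.637, 0.705, 0.797, 0.805, 0.813),
    "C": (0.754, 0.783, 0.768, 0.762, 0.785, 0.800),
    "D": (0.603, 0.795, 0.686, 0.609, 0.609, 0.615),
    "E": (0.585, 0.804, 0.677, 0.627, 0.665, 0.709),
    "experimental": (0.772, 0.768, 0.770, 0.798, 0.822, 0.835),
}


def best_group(metric):
    """Return the group with the highest value for the given metric column."""
    index = COLUMNS.index(metric)
    return max(TABLE7, key=lambda group: TABLE7[group][index])


for metric in COLUMNS:
    print(f"{metric}: best group is {best_group(metric)}")
```

On these figures the experimental group has the highest F1 and A20, while group A has the highest P, A5, and A10 and group E the highest R. The high-recall groups D and E pay for it with much lower precision.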