Research on an ontology-based information extraction method for fault cases
Abstract: To support the accumulation and reuse of experiential knowledge in aircraft maintenance support, this study addresses the difficulty of sharing and reusing fault case knowledge, which stems from the lack of structured, standardized descriptions, by investigating knowledge representation and information extraction methods for aircraft fault cases. First, an ontology model of aircraft fault case knowledge was established according to the particular characteristics of the aircraft fault domain and the practical requirements of knowledge sharing and reuse. Second, using Chinese word segmentation tools and the General Architecture for Text Engineering (GATE), semantic annotation and rule-based information extraction for fault case documents were studied. Finally, the Apache Jena inference engine was used to uncover implicit information, and the knowledge base was actively expanded with new knowledge discovered during information extraction. On this basis, a prototype information extraction system was developed. It extracts structured fault case information from several different types of documents and stores and manages it in a database, which improves the reusability of fault case knowledge and demonstrates the feasibility of the approach.
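The pipeline summarized in the abstract (semantic annotation against an ontology, rule-based extraction, inference of implicit facts, and knowledge base expansion) can be illustrated with a minimal, self-contained Python sketch. It does not use GATE or Jena and is not the authors' implementation. The vocabulary in `GAZETTEER`, the subclass links in `SUBCLASS_OF`, the single extraction rule, and the sample report sentence are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-ontology: term -> concept class.
# It stands in for the fault case ontology described in the abstract.
GAZETTEER = {
    "hydraulic pump": "Component",
    "fuel valve": "Component",
    "leakage": "FaultMode",
    "jamming": "FaultMode",
}

# Hypothetical subclass links used for a simple inference step.
SUBCLASS_OF = {
    "hydraulic pump": "hydraulic system",
    "hydraulic system": "airframe system",
}


def annotate(text):
    """Semantic annotation: tag every gazetteer term found in the text."""
    annotations = []
    lowered = text.lower()
    for term, concept in GAZETTEER.items():
        for match in re.finditer(re.escape(term), lowered):
            annotations.append((match.start(), match.end(), term, concept))
    return sorted(annotations)  # ordered by position in the text


def extract_case(text):
    """Rule: a Component followed later by a FaultMode forms a fault record."""
    anns = annotate(text)
    for i, (_, _, term, concept) in enumerate(anns):
        if concept != "Component":
            continue
        for _, _, later_term, later_concept in anns[i + 1:]:
            if later_concept == "FaultMode":
                return {"component": term, "fault_mode": later_term}
    return None


def infer_ancestors(term):
    """Follow subclass links transitively to surface implicit facts."""
    ancestors = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        ancestors.append(term)
    return ancestors


def learn_term(term, concept):
    """Knowledge base expansion: register a newly discovered term."""
    GAZETTEER.setdefault(term.lower(), concept)


report = "Inspection found the hydraulic pump showed leakage after landing."
case = extract_case(report)
case["located_in"] = infer_ancestors(case["component"])
print(case)
```

In this sketch, calling `learn_term("bleed air duct", "Component")` makes later calls to `extract_case` recognize the new term, which loosely mirrors the idea of expanding the knowledge base during extraction. A real system would replace the dictionary lookups with an ontology store and a rule engine.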