Journal of Beijing University of Aeronautics and Astronautics, 2018, Vol. 44, Issue (1): 132-141. doi: 10.13700/j.bh.1001-5965.2017.0031

Leave-one-out error bounds estimation for error correcting output codes

XUE Aijun, WANG Xiaodan

  1. Air and Missile Defense College, Air Force Engineering University, Xi'an 710051, China
  • Received: 2017-01-17 Revised: 2017-05-12 Online: 2018-01-20 Published: 2018-01-29
  • Corresponding author: WANG Xiaodan E-mail: wang_afeu@126.com
  • About the authors: XUE Aijun, male, PhD candidate; main research interest: pattern recognition. WANG Xiaodan, female, professor and doctoral supervisor; main research interest: machine learning.
  • Funding:
    National Natural Science Foundation of China (61273275, 61703426)

Abstract: Error correcting output codes (ECOC) is a decomposition framework that transforms a multiclass classification problem into a series of two-class classification problems, and it is an effective way to solve multiclass problems. To improve the generalization performance of ECOC, we study the design of its base classifiers, which is also known as model selection in ECOC. The key to this problem is estimating the generalization error of ECOC. Because the leave-one-out (LOO) error is an almost unbiased estimator of the generalization error, we study how to estimate LOO error bounds for ECOC. We first define the LOO error for ECOC. Based on this definition, we then give an upper bound and a lower bound on the LOO error for ECOC when the base classifiers are support vector machines (SVM) and the decoding method is linear loss function decoding. Experiments on synthetic and UCI datasets show that the upper bound on the LOO error can guide parameter selection for the base classifiers, and that designing the base classifiers can improve the generalization performance of ECOC. In addition, the training error of ECOC is a lower bound on its LOO error, and further study of this lower bound is left for future work.

Key words: pattern recognition, multiclass classification, error correcting output codes (ECOC), generalization performance, leave-one-out (LOO) error
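
As a rough illustration of the quantities named in the abstract, the following Python sketch builds a code matrix, decodes base classifier outputs with the linear loss L(z) = -z, and computes the LOO error by brute-force retraining. It is not the authors' method. The one-vs-all coding, the nearest-centroid scorer standing in for the SVM base classifiers, the toy data, and all function names are assumptions made for illustration, and the sketch computes the LOO error directly rather than the bounds discussed in the paper.

```python
def one_vs_all_code(n_classes):
    """Code matrix M: row r is the codeword of class r, entries in {+1, -1}.
    One-vs-all coding is an assumption; the abstract does not fix a coding."""
    return [[1 if r == s else -1 for s in range(n_classes)]
            for r in range(n_classes)]


def train_scorer(X, targets):
    """Hypothetical stand-in for an SVM base classifier: a nearest-centroid
    linear scorer f(x) = w.x + b. Assumes both target signs are present."""
    pos = [x for x, t in zip(X, targets) if t > 0]
    neg = [x for x, t in zip(X, targets) if t < 0]
    dim = len(X[0])
    mu_p = [sum(x[d] for x in pos) / len(pos) for d in range(dim)]
    mu_n = [sum(x[d] for x in neg) / len(neg) for d in range(dim)]
    w = [p - n for p, n in zip(mu_p, mu_n)]
    # The decision boundary passes through the midpoint of the two centroids.
    b = -sum(wd * (p + n) / 2 for wd, p, n in zip(w, mu_p, mu_n))
    return lambda x: sum(wd * xd for wd, xd in zip(w, x)) + b


def train_ecoc(X, y, code):
    """Train one binary scorer per column of the code matrix."""
    n_cols = len(code[0])
    return [train_scorer(X, [code[label][s] for label in y])
            for s in range(n_cols)]


def decode_linear_loss(code, outputs):
    """Pick the class r minimising sum_s L(M[r][s] * f_s(x)) with L(z) = -z."""
    losses = [sum(-m * f for m, f in zip(row, outputs)) for row in code]
    return min(range(len(code)), key=losses.__getitem__)


def predict(code, scorers, x):
    return decode_linear_loss(code, [f(x) for f in scorers])


def loo_error(X, y, code):
    """Brute-force LOO error: retrain without sample i, then test on sample i."""
    errors = 0
    for i in range(len(X)):
        scorers = train_ecoc(X[:i] + X[i + 1:], y[:i] + y[i + 1:], code)
        errors += predict(code, scorers, X[i]) != y[i]
    return errors / len(X)


# Toy 2-D data with three classes (made up for this sketch).
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
code = one_vs_all_code(3)
print("LOO error:", loo_error(X, y, code))
```

Brute-force LOO retrains every base classifier once per training sample, which is why bounds that can be evaluated from a single training run are attractive for parameter selection.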
