Abstract: This paper studies variable selection and parameter estimation in regression models whose predictors include both functional and multivariate vector data, which extends the range of application of functional data analysis and of variable selection methods. First, each functional predictor is projected onto the space spanned by its functional principal component basis functions. Then a penalized variable selection method is applied to the projected functional predictors (treated as groups) and to the multivariate vector predictors, so that variables are selected and their coefficients are estimated simultaneously. The tuning parameter of the penalty term is chosen adaptively. Taking the median absolute loss function as an example, the estimation problem is converted into a linear programming problem by introducing slack variables, which keeps the computational cost low. Simulation results show that the proposed method performs well in both variable selection and parameter estimation for regression models that contain functional predictors.
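The slack-variable reformulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the group-wise penalty is replaced here by a plain weighted L1 penalty, the adaptive weights come from an unpenalized pilot fit, and the function name `lad_weighted_l1`, the value of `lam`, and the synthetic data are all assumptions made for illustration. In the paper's setting the columns of `X` would hold the functional principal component scores together with the vector covariates.

```python
import numpy as np
from scipy.optimize import linprog


def lad_weighted_l1(X, y, lam, weights=None):
    """Minimise sum_i |y_i - x_i'b| + lam * sum_j w_j |b_j| as a linear program.

    Hypothetical helper: a simplified stand-in for the penalized
    absolute-loss estimator (no intercept, no group structure).
    """
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    # Slack variables: b = b_plus - b_minus and residual = u - v, all >= 0.
    # Decision vector z = [b_plus, b_minus, u, v].
    cost = np.concatenate([lam * w, lam * w, np.ones(n), np.ones(n)])
    # Equality constraint: X b_plus - X b_minus + u - v = y.
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:2 * p]


# Synthetic example with heavy-tailed (Cauchy) noise; all values are assumed.
rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.normal(size=(n, p))
beta_true = np.array([1.5, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.05 * rng.standard_cauchy(n)

# Pilot fit without penalty, then adaptive weights 1 / |pilot coefficient|.
beta_pilot = lad_weighted_l1(X, y, lam=0.0)
adaptive_w = 1.0 / (np.abs(beta_pilot) + 1e-8)
beta_hat = lad_weighted_l1(X, y, lam=2.0, weights=adaptive_w)
print(np.round(beta_hat, 3))
```

Because both the absolute loss and the penalty are piecewise linear, the whole problem stays linear after the split into nonnegative parts, so any LP solver can be used. A group-wise penalty such as a sup-norm penalty would need one extra slack variable per group, which this sketch does not implement.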
Key words:
- functional data
- variable selection
- parameter estimation
- quantile
- functional principal component
Table 1. Data simulation results with normal error

| (n, σ) | Statistic | TP | FP | RMSE | Bias |
|---|---|---|---|---|---|
| (100, 0.05) | Mean | 2 | 0.22 | 0.0282 | 0.0058 |
| | Sd | 0 | 0.52 | 0.0076 | 0.0044 |
| (100, 0.2) | Mean | 2 | 0.34 | 0.0844 | 0.0229 |
| | Sd | 0 | 0.61 | 0.0330 | 0.0179 |
| (300, 0.05) | Mean | 2 | 0.09 | 0.0168 | 0.0027 |
| | Sd | 0 | 0.30 | 0.0048 | 0.0020 |
| (300, 0.2) | Mean | 2 | 0.18 | 0.0491 | 0.0120 |
| | Sd | 0 | 0.42 | 0.0195 | 0.0098 |

Table 2. Data simulation results with Cauchy error

| (n, σ) | Statistic | TP | FP | RMSE | Bias |
|---|---|---|---|---|---|
| (100, 0.05) | Mean | 2 | 0.01 | 0.0360 | 0.0083 |
| | Sd | 0 | 0.07 | 0.0076 | 0.0044 |
| (100, 0.2) | Mean | 2 | 0.03 | 0.1168 | 0.0355 |
| | Sd | 0 | 0.16 | 0.0547 | 0.0301 |
| (300, 0.05) | Mean | 2 | 0 | 0.0195 | 0.0038 |
| | Sd | 0 | 0 | 0.0065 | 0.0028 |
| (300, 0.2) | Mean | 2 | 0.12 | 0.0621 | 0.0140 |
| | Sd | 0 | 0.32 | 0.0266 | 0.0116 |
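The TP and FP columns are not defined on this page. A common convention, assumed here, is that TP counts the truly relevant predictors that are selected and FP counts the irrelevant predictors that are selected, with a functional predictor counted as selected when any coefficient in its group is nonzero. The sketch below follows that assumed convention; the function name `selection_counts`, the group layout, and the example coefficients are hypothetical.

```python
import numpy as np


def selection_counts(beta_hat, groups, relevant, tol=1e-8):
    """Count true and false positives at the group level.

    groups   : list of index lists, one per predictor (assumed layout).
    relevant : list of booleans, True if the predictor is truly in the model.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    tp = fp = 0
    for idx, is_relevant in zip(groups, relevant):
        # A group counts as selected if any of its coefficients is nonzero.
        selected = np.max(np.abs(beta_hat[idx])) > tol
        if selected and is_relevant:
            tp += 1
        elif selected:
            fp += 1
    return tp, fp


# Two functional predictors (three scores each) and two scalar covariates.
groups = [[0, 1, 2], [3, 4, 5], [6], [7]]
relevant = [True, False, True, False]
beta_hat = [0.8, -0.3, 0.1, 0.0, 0.0, 0.0, 1.2, 0.05]
tp, fp = selection_counts(beta_hat, groups, relevant)
print(tp, fp)
```

Averaging these counts over simulation replications would give entries of the kind reported in the Mean rows, under the assumed definitions.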