Abstract: This paper studies variable selection and parameter estimation in a mixed-type regression model with both functional and multivariate predictors, which extends the scope of functional data analysis and of variable selection methods. First, each functional predictor is projected onto the space spanned by its functional principal component basis functions. Then a penalized variable selection method is applied to the projected functional predictors (treated as groups) and to the multivariate predictors, so that variables are selected and their coefficients estimated simultaneously. The tuning parameters of the penalty term are chosen adaptively, and the absolute-deviation (median regression) loss is taken as an example loss function. By introducing slack variables, the estimation problem is converted into a linear programming problem with linear constraints, which keeps the computational complexity low. Simulation results show that the proposed method performs well in both variable selection and parameter estimation for regression models with functional predictors.
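The slack-variable step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain adaptive L1 penalty on each coefficient rather than the paper's group penalty, and the function name `lad_lasso_lp`, its arguments, and the simulated data are hypothetical. The columns of `X` stand in for the multivariate predictors together with the functional principal component scores. Each coefficient and each residual is split into a non-negative positive part and negative part, so the absolute values become linear and the problem can be handed to `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def lad_lasso_lp(X, y, lam, weights=None):
    """Penalized absolute-deviation regression solved as a linear program.

    Minimizes  sum_i |y_i - x_i' beta| + lam * sum_j weights_j * |beta_j|.
    (Assumption: coefficient-wise L1 penalty, not a group penalty.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)

    # Decision vector z = [beta_plus, beta_minus, u, v], all >= 0, where
    # beta = beta_plus - beta_minus and the residual equals u - v.
    cost = np.concatenate([lam * w, lam * w, np.ones(n), np.ones(n)])

    # Equality constraints: X beta_plus - X beta_minus + u - v = y.
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])

    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:2 * p]


if __name__ == "__main__":
    # Hypothetical data: 5 predictors, 2 of them relevant, heavy-tailed noise.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    beta_true = np.array([1.0, 0.0, -2.0, 0.0, 0.0])
    y = X @ beta_true + 0.05 * rng.standard_cauchy(100)

    beta_init = lad_lasso_lp(X, y, lam=0.0)        # unpenalized first-stage fit
    weights = 1.0 / (np.abs(beta_init) + 1e-8)     # adaptive weights
    print(lad_lasso_lp(X, y, lam=1.0, weights=weights))
```

Because both the loss and the penalty are piecewise linear, no smoothing or iterative reweighting is needed; a group penalty that stays linear (for example a sup-norm over each group) would add one auxiliary variable and a few inequality constraints per group.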
Key words:
- functional data
- variable selection
- parameter estimation
- quantile
- functional principal component
Table 1. Data simulation results with normal error

| (n, σ) | Statistic | TP | FP | RMSE | Bias |
|---|---|---|---|---|---|
| (100, 0.05) | Mean | 2 | 0.22 | 0.0282 | 0.0058 |
| | Sd | 0 | 0.52 | 0.0076 | 0.0044 |
| (100, 0.2) | Mean | 2 | 0.34 | 0.0844 | 0.0229 |
| | Sd | 0 | 0.61 | 0.0330 | 0.0179 |
| (300, 0.05) | Mean | 2 | 0.09 | 0.0168 | 0.0027 |
| | Sd | 0 | 0.30 | 0.0048 | 0.0020 |
| (300, 0.2) | Mean | 2 | 0.18 | 0.0491 | 0.0120 |
| | Sd | 0 | 0.42 | 0.0195 | 0.0098 |

Table 2. Data simulation results with Cauchy error

| (n, σ) | Statistic | TP | FP | RMSE | Bias |
|---|---|---|---|---|---|
| (100, 0.05) | Mean | 2 | 0.01 | 0.0360 | 0.0083 |
| | Sd | 0 | 0.07 | 0.0076 | 0.0044 |
| (100, 0.2) | Mean | 2 | 0.03 | 0.1168 | 0.0355 |
| | Sd | 0 | 0.16 | 0.0547 | 0.0301 |
| (300, 0.05) | Mean | 2 | 0 | 0.0195 | 0.0038 |
| | Sd | 0 | 0 | 0.0065 | 0.0028 |
| (300, 0.2) | Mean | 2 | 0.12 | 0.0621 | 0.0140 |
| | Sd | 0 | 0.32 | 0.0266 | 0.0116 |
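For readers unfamiliar with the TP and FP columns, the sketch below shows one common way such entries are computed. It assumes that TP counts correctly selected relevant predictors and FP counts wrongly selected irrelevant ones, and that Mean and Sd are taken over simulation replications with the sample standard deviation; the page itself does not spell out these definitions. The function names and the example index sets are hypothetical.

```python
from statistics import mean, stdev


def selection_counts(selected, relevant):
    """Return (TP, FP) for one replication.

    selected: indices of predictors kept by the fitted model
    relevant: indices of predictors that truly enter the model
    """
    selected, relevant = set(selected), set(relevant)
    tp = len(selected & relevant)   # relevant predictors that were kept
    fp = len(selected - relevant)   # irrelevant predictors that were kept
    return tp, fp


def summarize(values):
    """Mean and sample standard deviation across replications (assumed)."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical selections from three replications; predictors 0 and 1 are relevant.
    runs = [{0, 1}, {0, 1, 3}, {0, 1}]
    counts = [selection_counts(run, {0, 1}) for run in runs]
    print("TP:", summarize([tp for tp, _ in counts]))
    print("FP:", summarize([fp for _, fp in counts]))
```

Under this reading, a TP mean of 2 with Sd 0 means that both relevant predictors were selected in every replication.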