A varying coefficient geographically weighted spatial lag model for compositional data
-
Abstract: Existing regression models for areal data with compositional covariates rarely account for spatial heterogeneity. To address this, a spatial autoregressive model for compositional data with varying coefficients is proposed. By treating the spatial lag parameter, the compositional coefficient, and the numerical coefficient as functions of the location coordinates, the model allows spatial effects and the linear relationships between the covariates and the response to vary over space. The parameters are estimated by combining the isometric log-ratio (ilr) transformation, instrumental variables, and the local linear geographically weighted method. A simulation study shows that the proposed model outperforms the existing spatial autoregressive model for compositional data and that the parameter estimators are effective. A real data set illustrates the practical utility of the proposed model.
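The estimation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `ilr` and `gw_2sls`, the Gaussian kernel, the bandwidth `h`, and the instrument set (the covariates and their spatial lags) are all assumptions made for illustration, and the local fit is a local-constant simplification of the local linear method described in the abstract.

```python
import numpy as np


def ilr(comp):
    """Map D-part compositions (rows) to D-1 isometric log-ratio coordinates.

    Uses one common orthonormal basis: coordinate j contrasts the geometric
    mean of the first j parts with part j+1.
    """
    logx = np.log(np.asarray(comp, dtype=float))
    n_parts = logx.shape[1]
    coords = []
    for j in range(1, n_parts):
        mean_log = logx[:, :j].mean(axis=1)  # log geometric mean of parts 1..j
        coords.append(np.sqrt(j / (j + 1.0)) * (mean_log - logx[:, j]))
    return np.column_stack(coords)


def gw_2sls(y, Z, x, W, coords, focal, h):
    """Kernel-weighted two-stage least squares at one focal location.

    y: response (n,); Z: ilr coordinates (n, D-1); x: numerical covariate (n,);
    W: row-normalised spatial weight matrix (n, n); coords: locations (n, 2);
    focal: location (2,) at which coefficients are estimated; h: bandwidth.
    Returns [rho, coefficients of Z..., coefficient of x] at the focal point.
    """
    # Gaussian kernel weights (an assumed kernel choice)
    d2 = ((coords - focal) ** 2).sum(axis=1)
    k = np.exp(-0.5 * d2 / h**2)

    R = np.column_stack([W @ y, Z, x])         # regressors, W @ y is endogenous
    H = np.column_stack([Z, x, W @ Z, W @ x])  # instruments (assumed set)

    HK = H.T * k                                 # H' K with K = diag(k)
    R_hat = H @ np.linalg.solve(HK @ H, HK @ R)  # weighted first-stage fit
    RK = R_hat.T * k
    return np.linalg.solve(RK @ R, RK @ y)       # weighted second stage
```

Calling `gw_2sls` once per observed location, with `focal` set to that location, yields one coefficient vector per region, which is the shape of output reported in Table 3. The coefficients on the ilr coordinates would still need to be mapped back to the simplex to be read as compositional coefficients.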
-
Table 1. RMSE and SSD for the spatial lag parameter and numerical coefficient

| Case | n | RMSE, proposed model: $\rho(u_i,v_i)$ | RMSE, proposed model: $\beta(u_i,v_i)$ | RMSE, existing model [11]: $\rho(u_i,v_i)$ | RMSE, existing model [11]: $\beta(u_i,v_i)$ | SSD, proposed model: $\rho(u_i,v_i)$ | SSD, proposed model: $\beta(u_i,v_i)$ | SSD, existing model [11]: $\rho(u_i,v_i)$ | SSD, existing model [11]: $\beta(u_i,v_i)$ |
|---|---|---|---|---|---|---|---|---|---|
| Case 1 | 49 | 0.010 | 0.010 | −0.005 | 0.001 | 0.16 | 0.22 | 0.03 | 0.04 |
| Case 1 | 144 | 0.008 | 0.006 | −0.003 | 0.000 | 0.08 | 0.10 | 0.01 | 0.02 |
| Case 1 | 441 | 0.009 | 0.004 | −0.001 | 0.000 | 0.05 | 0.06 | 0.01 | 0.01 |
| Case 2 | 49 | 0.083 | 0.008 | 0.312 | 0.134 | 0.26 | 0.27 | 0.20 | 0.23 |
| Case 2 | 144 | 0.023 | 0.006 | 0.280 | 0.126 | 0.11 | 0.11 | 0.11 | 0.13 |
| Case 2 | 441 | 0.014 | 0.003 | 0.266 | 0.121 | 0.07 | 0.06 | 0.06 | 0.07 |
Table 2. RMSE for each component and mean SSD for the entire compositional coefficient

| Case | n | RMSE, proposed model: $\beta_1^D(u_i,v_i)$ | RMSE, proposed model: $\beta_2^D(u_i,v_i)$ | RMSE, proposed model: $\beta_3^D(u_i,v_i)$ | RMSE, existing model [11]: $\beta_1^D(u_i,v_i)$ | RMSE, existing model [11]: $\beta_2^D(u_i,v_i)$ | RMSE, existing model [11]: $\beta_3^D(u_i,v_i)$ | SSD (entire coefficient), proposed model | SSD (entire coefficient), existing model [11] |
|---|---|---|---|---|---|---|---|---|---|
| Case 1 | 49 | 0.003 | 0.003 | 0.001 | 0.001 | 0.000 | −0.001 | 0.050 | 0.009 |
| Case 1 | 144 | 0.002 | 0.001 | 0.001 | −0.001 | 0.000 | 0.000 | 0.024 | 0.005 |
| Case 1 | 441 | 0.001 | 0.001 | 0.001 | 0.000 | 0.000 | 0.000 | 0.014 | 0.003 |
| Case 2 | 49 | 0.007 | 0.012 | 0.018 | 0.092 | 0.148 | 0.191 | 0.056 | 0.057 |
| Case 2 | 144 | 0.004 | 0.007 | 0.011 | 0.088 | 0.144 | 0.187 | 0.023 | 0.032 |
| Case 2 | 441 | 0.002 | 0.004 | 0.007 | 0.086 | 0.142 | 0.184 | 0.012 | 0.018 |
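Table 3. Regression results for the proposed model

| City | $\rho$ | $\beta_1^D$ | $\beta_2^D$ | $\beta_3^D$ | $\beta$ |
|---|---|---|---|---|---|
| Hinggan League | −0.10 | 1.09 | −0.16 | −3.42 | 3.61 |
| Tongliao | −0.31 | 0.85 | −1.31 | −4.04 | 2.58 |
| Xilingol | 0.22 | 1.30 | 1.37 | 4.85 | 0.63 |
| Chifeng | −0.09 | 0.66 | 0.85 | −1.26 | 0.38 |
| Zhangjiakou | 0.12 | 0.32 | −0.44 | 0.41 | 0.20 |
| Baotou | 0.13 | −0.04 | 2.25 | 3.86 | −0.21 |
| Hohhot | 0.00 | 0.24 | 1.56 | 2.35 | 0.00 |
| Ulanqab | 0.05 | 0.16 | 1.05 | 2.50 | 0.04 |
| Ordos | −0.09 | 0.09 | 1.63 | 3.21 | −0.10 |
| Bayannur | 0.75 | 0.84 | 0.61 | 6.50 | −1.10 |
| Wuhai | 0.53 | −0.73 | 0.61 | 3.83 | −1.13 |
| Alxa League | 0.24 | −0.70 | 1.36 | 2.51 | −1.41 |
| Chengde | −0.12 | 0.31 | 0.28 | −2.49 | 0.13 |
| Beijing | −0.03 | 0.06 | −0.76 | −1.09 | 0.01 |
| Tianjin | −0.14 | −0.20 | −0.56 | −1.69 | −0.20 |
| Tangshan | −0.20 | −0.18 | 0.26 | −2.38 | −0.12 |
| Qinhuangdao | −0.34 | −0.29 | 1.46 | −3.03 | −0.02 |
| Langfang | −0.07 | −0.05 | −0.75 | −1.32 | −0.08 |
| Baoding | −0.05 | −0.10 | −1.38 | −0.20 | −0.03 |
| Cangzhou | −0.17 | −0.36 | −0.90 | −1.72 | −0.34 |
| Hengshui | −0.12 | −0.38 | −1.52 | −0.91 | −0.27 |
| Xingtai | −0.11 | −0.43 | −1.52 | −0.35 | −0.25 |
| Handan | −0.19 | −0.56 | −1.27 | −0.89 | −0.42 |
| Changzhi | −0.24 | −0.58 | −0.07 | −0.71 | −0.33 |
| Jincheng | −0.40 | −0.75 | 1.29 | −0.50 | −0.66 |
| Yuncheng | −0.67 | −0.77 | 4.20 | 2.09 | −0.63 |
| Linfen | −0.32 | −0.56 | 1.18 | 0.43 | −0.12 |
| Lüliang | −0.09 | −0.17 | −0.29 | 1.86 | 0.22 |
| Jinzhong | 0.01 | −0.14 | −1.17 | 1.57 | 0.18 |
| Taiyuan | 0.02 | −0.09 | −1.09 | 1.81 | 0.22 |
| Yangquan | 0.01 | −0.15 | −1.47 | 1.19 | 0.10 |
| Xinzhou | 0.02 | 0.02 | −0.98 | 2.00 | 0.26 |
| Shijiazhuang | −0.03 | −0.19 | −1.62 | 0.51 | 0.00 |
| Shuozhou | −0.04 | 0.19 | −0.10 | 2.06 | 0.25 |
| Datong | 0.05 | 0.23 | −0.06 | 1.53 | 0.20 |

-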
[1] FERRERS N M. An elementary treatise on trilinear co-ordinates: The method of reciprocal polars, and the theory of projections[M]. New York: Macmillan and Company, 1876.
[2] VAN DEN BOOGAART K G, FILZMOSER P, HRON K, et al. Classical and robust regression analysis with compositional data[J]. Mathematical Geosciences, 2021, 53(5): 823-858. doi: 10.1007/s11004-020-09895-w
[3] ARBORETTI GIANCRISTOFARO R, GASTALDI M, MARTINELLO L, et al. Regression analysis with compositional data using orthogonal log-ratio coordinates[J]. Communications in Statistics - Simulation and Computation, 2022, 51(4): 1932-1945. doi: 10.1080/03610918.2019.1691224
[4] GONG R Z, YAO J Q, LIU X L. Isometric logratio transformation of industrial structure data and its application[J]. Statistics & Decision, 2023, 39(19): 53-59 (in Chinese).
[5] MISHRA A, MÜLLER C L. Robust regression with compositional covariates[J]. Computational Statistics & Data Analysis, 2022, 165: 107315.
[6] YOO J, SUN Z Q, GREENACRE M, et al. A guideline for the statistical analysis of compositional data in immunology[J]. Communications for Statistical Applications and Methods, 2022, 29(4): 453-469. doi: 10.29220/CSAM.2022.29.4.453
[7] LONG W, WANG H W. PLS logistic regression on compositional data and its application[J]. The Journal of Quantitative & Technical Economics, 2006, 23(9): 156-161 (in Chinese). doi: 10.3969/j.issn.1000-3894.2006.09.017
[8] LI Y Y, ZHANG J X. Logistic regression model for compositional data[J]. Journal of Applied Statistics and Management, 2019, 38(3): 442-449 (in Chinese).
[9] LIN W, SHI P X, FENG R, et al. Variable selection in regression with compositional covariates[J]. Biometrika, 2014, 101(4): 785-797. doi: 10.1093/biomet/asu031
[10] WANG H W, WANG Z C, WANG S S. Sliced inverse regression method for multivariate compositional data modeling[J]. Statistical Papers, 2021, 62(1): 361-393. doi: 10.1007/s00362-019-01093-z
[11] MA X J, ZHANG P. Quantile regression for compositional covariates[J]. Communications in Statistics - Simulation and Computation, 2023, 52(3): 658-668. doi: 10.1080/03610918.2020.1862231
[12] HUANG S M, AILER E, KILBERTUS N, et al. Supervised learning and model analysis with compositional data[J]. PLoS Computational Biology, 2023, 19(6): e1011240. doi: 10.1371/journal.pcbi.1011240
[13] ALENAZI A. A review of compositional data analysis and recent advances[J]. Communications in Statistics - Theory and Methods, 2023, 52(16): 5535-5567. doi: 10.1080/03610926.2021.2014890
[14] CRESSIE N A C. Statistics for spatial data[M]. Hoboken: Wiley, 1993.
[15] CLAROTTO L, ALLARD D, MENAFOGLIO A. A new class of α-transformations for the spatial analysis of compositional data[J]. Spatial Statistics, 2022, 47: 100570. doi: 10.1016/j.spasta.2021.100570
[16] HUANG T T, WANG H W, GILBERT S. Spatial autoregressive model for compositional data[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(1): 93-98 (in Chinese).
[17] HUANG T T, SAPORTA G, WANG H W. A spatial durbin model for compositional data[C]//Proceedings of the Advances in Contemporary Statistics and Econometrics. Berlin: Springer, 2021: 471-488.
[18] ZENG M. Analysis of mixed spatial autoregressive model with missing data[D]. Kunming: Yunnan University, 2021 (in Chinese).
[19] PÁEZ A, UCHIDA T, MIYAMOTO K. A general framework for estimation and inference of geographically weighted regression models: 1. Location-specific kernel bandwidths and a test for locational heterogeneity[J]. Environment and Planning A: Economy and Space, 2002, 34(4): 733-754. doi: 10.1068/a34110
[20] BRUNSDON C, FOTHERINGHAM A S, CHARLTON M. Spatial nonstationarity and autoregressive models[J]. Environment and Planning A: Economy and Space, 1998, 30(6): 957-973. doi: 10.1068/a300957
[21] LIN G P, LONG Z H. Spatial econometrics[M]. Beijing: Science Press, 2014 (in Chinese).
[22] PÁEZ A, UCHIDA T, MIYAMOTO K. A general framework for estimation and inference of geographically weighted regression models: 2. Spatial association and model specification tests[J]. Environment and Planning A: Economy and Space, 2002, 34(5): 883-904. doi: 10.1068/a34133
[23] WEI C H, WANG S J, SU Y N. Local GMM estimation in spatial varying coefficient geographically weighted autoregressive model[J]. Journal of Statistics and Information, 2022, 37(11): 3-13 (in Chinese).
[24] EGOZCUE J J, PAWLOWSKY-GLAHN V, MATEU-FIGUERAS G, et al. Isometric logratio transformations for compositional data analysis[J]. Mathematical Geology, 2003, 35(3): 279-300. doi: 10.1023/A:1023818214614
[25] HRON K, FILZMOSER P, THOMPSON K. Linear regression with compositional explanatory variables[J]. Journal of Applied Statistics, 2012, 39(5): 1115-1128. doi: 10.1080/02664763.2011.644268