Application of space-time regional economy visualization based on telecom big data analysis
-
Abstract: The number of mobile phone users in China has reached 1.59 billion. With a user base this large, the characteristics of telecom big data reflect, to a certain extent, the activity patterns of the population, and in turn the development status of specific regions. The space-time regional economy visualization application uses data mining to process and extract information from telecom big data, which improves data quality. It then filters the data under different rules and analyzes it with modeling techniques. Combined with multi-source information such as electronic map data and traffic data, the data is used to analyze user behavior characteristics from multiple perspectives. The application visualizes the space-time regional economic situation and analyzes the living attributes of residents. It also uses the difference-in-differences (DID) statistical model to evaluate regional economic policies. The results of the feature analysis provide a decision-making basis for siting regional economic development hotspots and for guiding the layout of urban business districts, improving the efficiency of urban system operation and expanding the range of regional economic benefits.
-
Keywords:
- telecom big data
- mobile phone signaling
- data mining
- regional economy
- visualization
-
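Table 1. Basic format of telecom big data

| No. | Field name | Description |
| --- | --- | --- |
| 1 | USER_ID | User identifier |
| 2 | ULI | Base station identifier |
| 3 | LAC | Location area code |
| 4 | CI | Cell identifier |
| 5 | APPEAR_TIME | Timestamp |
| 6 | DURATION | Dwell time |
| 7 | HOMECODE | User's home area |
| 8 | AREACODE | Current location of the phone |
| 9 | DATE | Date |
| 10 | HOUR | Hour |
| 11 | EVENT_FLAG | Event flag |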
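The eleven fields in Table 1 can be read as a flat record. The sketch below shows one way to hold such a record in Python and to apply a simple screening rule of the kind the abstract mentions. It is an illustration only: the Python types, the comma delimiter, the timestamp format, the unit of `DURATION`, the `is_roaming` rule, and every value in `sample` are assumptions, not details given in the source.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SignalingRecord:
    """One row in the Table 1 layout. The Python types are assumptions."""
    user_id: str
    uli: str
    lac: str
    ci: str
    appear_time: datetime
    duration: int          # unit assumed to be seconds
    homecode: str
    areacode: str
    date: str
    hour: int
    event_flag: str


def parse_record(line: str, sep: str = ",") -> SignalingRecord:
    """Parse one delimited line; the delimiter and time format are assumed."""
    parts = line.rstrip("\n").split(sep)
    if len(parts) != 11:
        raise ValueError(f"expected 11 fields, got {len(parts)}")
    (user_id, uli, lac, ci, appear_time, duration,
     homecode, areacode, date, hour, event_flag) = parts
    return SignalingRecord(
        user_id=user_id, uli=uli, lac=lac, ci=ci,
        appear_time=datetime.strptime(appear_time, "%Y-%m-%d %H:%M:%S"),
        duration=int(duration),
        homecode=homecode, areacode=areacode,
        date=date, hour=int(hour), event_flag=event_flag,
    )


def is_roaming(rec: SignalingRecord) -> bool:
    """Hypothetical screening rule: the phone is outside the user's home area."""
    return rec.homecode != rec.areacode


# Made-up sample line, used only to exercise the parser.
sample = "u001,460-00-4301-22011,4301,22011,2020-10-01 20:15:00,600,010,022,20201001,20,1"
rec = parse_record(sample)
```

Keeping the record immutable (`frozen=True`) makes it safe to share parsed rows between filtering steps. In a Spark pipeline the same layout would more likely be declared as a schema than as a dataclass.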
Table 2. Experimental environment
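| Server item | Configuration |
| --- | --- |
| Operating system | CentOS 7.3 |
| Memory | 256 GB |
| Disk | 7.3 TB × 6 |
| CPU | Intel(R) Xeon(R) Gold 5118 CPU @ 2.30 GHz |
| Development languages | Java, Python, Scala, etc. |
| Technical frameworks | Spark cluster computing framework, HDFS file storage framework, Hive, HBase, Redis, Kafka, etc. |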
Table 3. Analysis results
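| Symbol | Variable | Coefficient estimates |
| --- | --- | --- |
| β0 | Intercept | 36 460.2 / 27 730.8*** |
| β1 | Temperature | -1 633.7* |
| δ1 | Holiday | 12 974.2** |
| δ2 | Night | -9 066* / -9 066.3* |
| δ3 | Policy effect | 621.5* / 621’ |

Note: "’" denotes p < 0.1, "*" denotes p < 0.05, "**" denotes p < 0.01, and "***" denotes p < 0.001. Where a row lists two estimates, they are for Model 1 and Model 2, in that order.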
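The policy-effect coefficient in Table 3 comes from a difference-in-differences (DID) model. As a reminder of what such an estimate measures, the sketch below computes the textbook two-group, two-period DID from cell means. It is not the paper's regression, which also includes temperature, holiday, and night terms. The function name `did_estimate` and all numbers in `rows` are illustrative assumptions.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period DID from cell means.

    rows: iterable of (treated, post, outcome), with treated and post in {0, 1}.
    Returns (treated after - treated before) - (control after - control before).
    """
    cells = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    if any(not values for values in cells.values()):
        raise ValueError("every group/period cell needs at least one observation")
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(1, 1)] - m[(1, 0)]
    control_change = m[(0, 1)] - m[(0, 0)]
    return treated_change - control_change


# Illustrative data only: (treated, post, outcome).
rows = [
    (0, 0, 100.0), (0, 0, 110.0),   # control area, before the policy
    (0, 1, 120.0), (0, 1, 130.0),   # control area, after the policy
    (1, 0, 200.0), (1, 0, 220.0),   # treated area, before the policy
    (1, 1, 260.0), (1, 1, 280.0),   # treated area, after the policy
]
effect = did_estimate(rows)  # (270 - 210) - (125 - 105) = 40.0
```

In this 2×2 setting the result equals the coefficient on the treated × post interaction in an ordinary least squares regression of the outcome on treated, post, and their product. Adding covariates such as those in Table 3 requires fitting the regression itself.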