-
摘要:
轨迹数据对各种应用都具有重要的价值,然而共享与利用轨迹数据的同时保护用户隐私是一个长期的挑战。目前流行的轨迹共享隐私保护方法是基于差分隐私技术来生成与真实轨迹高度相似的完整合成轨迹,然而这些方法会造成合成轨迹可用性低且易遭受位置隐私推理攻击的问题。为此,提出一种效用增强的差分私有轨迹合成方法(UtiE-DPT)。该方法通过多次空间划分真实轨迹数据集,构建了细粒度的自适应密度网格来离散化真实轨迹,并设计了适合于自适应密度网格的差分私有马尔可夫移动模型、轨迹行程分布和轨迹长度分布计算模型来更精确地提取真实轨迹的效用保持关键统计特征,有效增强了合成轨迹的效用。为了防止轨迹隐私泄露,对提取的关键统计特征进行差分私有扰动。根据提取的特征并结合抗攻击约束策略生成抵御推理攻击的合成轨迹。在模拟与真实数据集上的大量实验结果表明,与现有的轨迹合成隐私保护方法DP-Star和AdaTrace相比,UtiE-DPT在保护轨迹隐私和抵御位置隐私推理攻击的同时增强了合成轨迹的可用性。在不考虑抵御推理攻击时,UtiE-DPT生成合成轨迹的查询误差比AdaTrace降低了21%~27%,比DP-Star降低了32%~53%;在抵御推理攻击后,虽然合成轨迹的弹性比比AdaTrace降低了约1%~2%,但查询误差比AdaTrace降低了16%~21%,在隐私保护与效用之间达成更好的均衡。
Abstract:Trajectory data is valuable for a variety of applications. However, it has been a long-standing challenge to share and utilize trajectory data while protecting users’ privacy. Currently, the prevailing shared trajectory privacy-preserving methods are to generate complete synthetic trajectories that are highly similar to real trajectories based on differential privacy, which results in poor utility of synthesis trajectories and is vulnerable to location privacy inference attacks. To address these problems, this paper proposed a utility-enhanced synthesis method of differentially private trajectories (UtiE-DPT). By dividing the real trajectory dataset spatially, this method constructed a fine-grained adaptive density grid structure to discretize the real trajectory and designed a Markov transition matrix, trajectory travel distribution, and trajectory length distribution calculation model suitable for the adaptive density grid structure, so as to extract key statistical features to maintain the utility of real trajectories and thus enhance the utility of synthetic trajectories. To preserve the users’ privacy, the differential privacy technique was employed to perturb these key statistical features. Finally, a synthetic trajectory against inference attacks was generated according to the extracted features and anti-attack constraint strategy. Comprehensive experiments on real datasets and simulation datasets show that compared with the existing trajectory synthesis privacy protection methods such as DP-Star and AdaTrace, the UtiE-DPT not only protects trajectory privacy and resists location privacy inference attacks but also improves the utility of synthetic trajectories. Without considering the inference attacks, the query error of UtiE-DPT for generating synthetic trajectories is 21%–27% lower than AdaTrace and 32%–53% lower than DP-Star. After resisting the inference attacks, although the robustness of the synthetic trajectory is reduced by about 1%–2% compared with AdaTrace, the query error is reduced by 16%–21% compared with AdaTrace, achieving a better balance between privacy protection and utility.
-
Key words:
- trajectory privacy /
- differential privacy /
- trajectory publishing /
- Markov chain /
- inference attack
-
表 1 隐私分配效用对比
Table 1. Utility comparison of privacy allocation
${\varepsilon _1}$ ${\varepsilon _2}$ ${\varepsilon _3}$ ${\varepsilon _4}$ ${V_{{\text{QE}}}}$ Brinkhoff-20k Geolife 1/9 1/9 6/9 1/9 0.1282 0.2771 1/9 2/9 5/9 1/9 0.1275 0.2624 1/9 3/9 4/9 1/9 0.1201 0.2590 1/9 4/9 3/9 1/9 0.1101 0.2498 1/9 5/9 2/9 1/9 0.1653 0.3902 1/9 6/9 1/9 1/9 0.1737 0.4226 1/9 4/9 2/9 2/9 0.1511 0.3123 1/9 1/9 1/9 3/9 0.1911 0.3218 2/9 3/9 3/9 1/9 0.1449 0.3445 2/9 3/9 2/9 2/9 0.1690 0.3612 2/9 3/9 1/9 3/9 0.1837 0.3822 3/9 2/9 3/9 1/9 0.1490 0.3028 3/9 2/9 2/9 2/9 0.1780 0.3298 3/9 2/9 1/9 3/9 0.1804 0.3593 4/9 1/9 3/9 1/9 0.1544 0.2806 4/9 1/9 2/9 2/9 0.1568 0.3424 4/9 1/9 1/9 3/9 0.1935 0.3762 表 2 Brinkhoff-20k数据集合成轨迹效用对比
Table 2. Utility comparison of synthetic trajectory of Brinkhoff-20k dataset
对比方法 $\varepsilon $ ${V_{{\text{QE}}}}$ $ {V_{{\text{FP}}}} $ $ {V_{{\text{TE}}}} $ ${V_{{\text{LE}}}}$ ${V_{{\text{DE}}}}$ ${V_{{\text{FPS}}}}$ $ {V_{{\text{KT}}}} $ t/s AdaTrace 0.5 0.1561 0.4805 0.0448 0.0616 0.0354 0.5960 0.7173 4.6 1.0 0.1442 0.4100 0.0188 0.0516 0.0275 0.6810 0.7269 4.4 2.0 0.1247 0.4089 0.0118 0.0481 0.0263 0.6825 0.7255 4.4 DP-Star 0.5 0.1664 0.4291 0.0391 0.0712 0.0450 0.6510 0.7149 304.5 1.0 0.1512 0.3712 0.0296 0.0618 0.0421 0.6957 0.7214 283.3 2.0 0.1351 0.3704 0.0193 0.0589 0.0371 0.6983 0.7231 273.8 UtiE-DPT 0.5 0.1185 0.4652 0.0459 0.0664 0.0341 0.6600 0.7200 23.8 1.0 0.0971 0.4206 0.0227 0.0486 0.0217 0.6840 0.7354 18.9 2.0 0.0946 0.4414 0.0133 0.0472 0.0203 0.6800 0.7382 18.8 表 3 Geolife数据集合成轨迹效用对比
Table 3. Utility comparison of synthetic trajectory of Geolife dataset
对比方法 $\varepsilon $ ${V_{{\text{QE}}}}$ $ {V_{{\text{FP}}}} $ $ {V_{{\text{TE}}}} $ ${V_{{\text{LE}}}}$ ${V_{{\text{DE}}}}$ ${V_{{\text{FPS}}}}$ $ {V_{{\text{KT}}}} $ t/s AdaTrace 0.5 0.2992 0.8371 0.3000 0.0129 0.0928 0.5960 0.6204 2.1 1.0 0.2151 0.8299 0.2367 0.0039 0.0834 0.6810 0.6682 1.3 2.0 0.1809 0.7805 0.1811 0.0038 0.0867 0.6825 0.6793 1.1 DP-Star 0.5 0.4273 0.8811 0.2086 0.0637 0.1689 0.4800 0.6289 173.4 1.0 0.3831 0.8518 0.1618 0.0556 0.1636 0.5200 0.6376 170.2 2.0 0.3648 0.8103 0.1371 0.0476 0.1453 0.5500 0.6377 165.2 UtiE-DPT 0.5 0.2594 0.5229 0.1456 0.0408 0.0903 0.5900 0.6131 17.6 1.0 0.1739 0.4915 0.0836 0.0220 0.0533 0.6300 0.6477 15.9 2.0 0.1244 0.4869 0.0497 0.0144 0.0339 0.7000 0.6719 15.2 表 4 抵御推理攻击的合成轨迹效用对比
Table 4. Utility comparison of synthetic trajectories against inference attacks
数据集 对比方法 $\varepsilon $ ${V_{{\text{QE}}}}$ $ {V_{{\text{FP}}}} $ $ {V_{{\text{TE}}}} $ ${V_{{\text{LE}}}}$ ${V_{{\text{DE}}}}$ ${V_{{\text{FPS}}}}$ $ {V_{{\text{KT}}}} $ Brinkhoff-20k AdaTrace 0.5 0.1644 0.5447 0.0789 0.0518 0.0258 0.5300 0.6356 1.0 0.1634 0.4849 0.0557 0.0441 0.0201 0.6200 0.6394 2.0 0.1544 0.4788 0.0489 0.0406 0.0188 0.6400 0.6544 UtiE-DPT 0.5 0.1394 0.4867 0.0703 0.0482 0.0211 0.6200 0.6377 1.0 0.1391 0.4904 0.0565 0.0428 0.0174 0.5900 0.6467 2.0 0.1272 0.4911 0.0487 0.0352 0.0148 0.6200 0.6531 Geolife AdaTrace 0.5 0.3120 0.8520 0.3298 0.0100 0.0933 0.5300 0.5129 1.0 0.2341 0.8253 0.2673 0.0096 0.0923 0.5700 0.5769 2.0 0.2075 0.7949 0.1890 0.0059 0.0849 0.5900 0.5982 UtiE-DPT 0.5 0.2735 0.5326 0.2831 0.0519 0.0936 0.5400 0.5204 1.0 0.1900 0.5216 0.1982 0.0306 0.0859 0.5700 0.5793 2.0 0.1441 0.5138 0.1192 0.0273 0.0799 0.5900 0.5971 -
[1] 张学军, 桂小林, 伍忠东. 位置服务隐私保护研究综述[J]. 软件学报, 2015, 26(9): 2373-2395.ZHANG X J, GUI X L, WU Z D. Privacy preservation for location-based services: A survey[J]. Journal of Software, 2015, 26(9): 2373-2395 (in Chinese). [2] CHEN Z B, SHEN H T, ZHOU X F. Discovering popular routes from trajectories[C]// 2011 IEEE 27th International Conference on Data Engineering. Piscataway: IEEE Press, 2011: 900-911. [3] XU F L, ZHANG P Y, LI Y. Context-aware real-time population estimation for metropolis[C]// Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. New York: ACM, 2016: 1064-1075. [4] ZHOU C H, SU F Z, PEI T, et al. COVID-19: Challenges to GIS with big data[J]. Geography and Sustainability, 2020, 1(1): 77-87. doi: 10.1016/j.geosus.2020.03.005 [5] DING D Z, ZHANG M, PAN X D, et al. Geographical feature extraction for entities in location-based social networks[C]// Proceedings of the 2018 World Wide Web Conference on World Wide Web. New York: ACM, 2018: 833-842. [6] LIU Q, YU J, HAN J M, et al. Differentially private and utility-aware publication of trajectory data[J]. Expert Systems with Applications, 2021, 180: 115120. doi: 10.1016/j.eswa.2021.115120 [7] YUAN S L, PI D C, ZHAO X D, et al. Differential privacy trajectory data protection scheme based on R-tree[J]. Expert Systems with Applications, 2021, 182: 115215. doi: 10.1016/j.eswa.2021.115215 [8] WANG Y F, LI M Z, LUO S S, et al. LRM: A location recombination mechanism for achieving trajectory: Anonymity privacy protection[J]. IEEE Access, 2019, 7: 182886-182905. doi: 10.1109/ACCESS.2019.2960008 [9] LIU Z P, ZHAO X, DONG Y W, et al. Trajectory rotation privacy protection algorithm based on k anonymity[J]. Journal of Computer and Communications, 2018, 6(2): 36-47. doi: 10.4236/jcc.2018.62004 [10] 张少波, 王国军, 刘琴, 等. 基于多匿名器的轨迹隐私保护方法[J]. 计算机研究与发展, 2019, 56(3): 576-584. doi: 10.7544/issn1000-1239.2019.20180033ZHANG S B, WANG G J, LIU Q, et al. Trajectory privacy protection method based on multi-anonymizer[J]. Journal of Computer Research and Development, 2019, 56(3): 576-584 (in Chinese). doi: 10.7544/issn1000-1239.2019.20180033 [11] 陈传明, 林文诗, 俞庆英, 等. 一种基于单点收益的轨迹隐私保护方法[J]. 电子学报, 2020, 48(1): 143-152.CHEN C M, LIN W S, YU Q Y, et al. A trajectory privacy-preserving method based on single point gain[J]. Acta Electronica Sinica, 2020, 48(1): 143-152 (in Chinese). [12] LIN C Y. Suppression techniques for privacy-preserving trajectory data publishing[J]. Knowledge-Based Systems, 2020, 206: 106354. doi: 10.1016/j.knosys.2020.106354 [13] 叶阿勇, 孟玲玉, 赵子文, 等. 基于预测和滑动窗口的轨迹差分隐私保护机制[J]. 通信学报, 2020, 41(4): 123-133. doi: 10.11959/j.issn.1000-436x.2020049YE A Y, MENG L Y, ZHAO Z W, et al. Trajectory differential privacy protection mechanism based on prediction and sliding window[J]. Journal on Communications, 2020, 41(4): 123-133 (in Chinese). doi: 10.11959/j.issn.1000-436x.2020049 [14] 袁水莲, 皮德常, 胥萌. 基于差分隐私的轨迹隐私保护方法[J]. 电子学报, 2021, 49(7): 1266-1273. doi: 10.12263/DZXB.20200827YUAN S L, PI D C, XU M. Trajectory privacy protection method based on differential privacy[J]. Acta Electronica Sinica, 2021, 49(7): 1266-1273 (in Chinese). doi: 10.12263/DZXB.20200827 [15] ZHAO X D, PI D C, CHEN J F. Novel trajectory privacy-preserving method based on prefix tree using differential privacy[J]. Knowledge-Based Systems, 2020, 198: 105940. doi: 10.1016/j.knosys.2020.105940 [16] GURSOY M E, LIU L, TRUEX S, et al. Utility-aware synthesis of differentially private and attack-resilient location traces[C]// Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. New York: ACM, 2018: 196-211. [17] KONSTANTINO C, CATUSCIA P, MARCO S. A predictive differentially-private mechanism for mobility traces[C]//Proceedings of International Symposium on Privacy Enhancing Technologies. Berlin: Springer, 2014: 21-41. [18] 陈思, 付安民, 苏铓, 等. 基于差分隐私的轨迹隐私保护方案[J]. 通信学报, 2021, 42(9): 54-64. doi: 10.11959/j.issn.1000-436x.2021168CHEN S, FU A M, SU M, et al. Trajectory privacy protection scheme based on differential privacy[J]. Journal on Communications, 2021, 42(9): 54-64 (in Chinese). doi: 10.11959/j.issn.1000-436x.2021168 [19] 李凤华, 张翠, 牛犇, 等. 高效的轨迹隐私保护方案[J]. 通信学报, 2015, 36(12): 114-123. doi: 10.11959/j.issn.1000-436x.2015320LI F H, ZHANG C, NIU B, et al. Efficient scheme for user’s trajectory privacy[J]. Journal on Communications, 2015, 36(12): 114-123 (in Chinese). doi: 10.11959/j.issn.1000-436x.2015320 [20] LIU X Y, CHEN J M, XIA X F, et al. Dummy-based trajectory privacy protection against exposure location attacks[C]//International Conference on Web Information Systems and Applications. Berlin: Springer, 2019: 368-381. [21] BINDSCHAEDLER V, SHOKRI R. Synthesizing plausible privacy-preserving location traces[C]// 2016 IEEE Symposium on Security and Privacy. Piscataway: IEEE Press, 2016: 546-563. [22] HE X, CORMODE G, MACHANAVAJJHALA A, et al. DPT[J]. Proceedings of the VLDB Endowment, 2015, 8(11): 1154-1165. doi: 10.14778/2809974.2809978 [23] GURSOY M E, LIU L, TRUEX S, et al. Differentially private and utility preserving publication of trajectory data[J]. IEEE Transactions on Mobile Computing, 2019, 18(10): 2315-2329. doi: 10.1109/TMC.2018.2874008 [24] CUNNINGHAM T, CORMODE G, FERHATOSMANOGLU H. Privacy-preserving synthetic location data in the real world[C]// 17th International Symposium on Spatial and Temporal Databases. New York: ACM, 2021: 23-33. [25] YANG J Y, CHENG X, SU S, et al. Collecting individual trajectories under local differential privacy[C]// 2022 23rd IEEE International Conference on Mobile Data Management. Piscataway: IEEE Press, 2022: 99-108. [26] DU Y T, HU Y J, ZHANG Z K, et al. LDPTrace: Locally differentially private trajectory synthesis[J]. Proceedings of the VLDB Endowment, 2023, 16(8): 1897-1909. doi: 10.14778/3594512.3594520 [27] SHAO M L, LI J X, YAN Q B, et al. Structured sparsity model based trajectory tracking using private location data release[J]. IEEE Transactions on Dependable and Secure Computing, 2021, 18(6): 2983-2995. [28] DWORK C, KENTHAPADI K, MCSHERRY F, et al. Our data, ourselves: Privacy via distributed noise generation[M]// Annual International Conference on the Theory and Applications of Cryptographic Trchniques. Berlin: Springer, 2006: 486-503. [29] MCSHERRY F D. Privacy integrated queries: An extensible platform for privacy-preserving data analysis[C]// Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. New York: ACM, 2009: 19-30. [30] BRINKHOFF T. A framework for generating network-based moving objects[J]. GeoInformatica, 2002, 6(2): 153-180. doi: 10.1023/A:1015231126594 [31] ZHENG Y, ZHANG L Z, XIE X, et al. Mining interesting locations and travel sequences from GPS trajectories[C]// Proceedings of the 18th International Conference on World Wide Web. New York: ACM, 2009: 791-800.