Abstract: Unreachable nodes are nodes in the Bitcoin network that do not accept incoming connection requests, which makes them difficult to discover and verify. Existing studies focus mostly on reachable nodes and pay little attention to unreachable ones. This paper proposes a discovery method based on a decision tree algorithm that automatically identifies unreachable nodes among a large number of Bitcoin addresses. On the experimental dataset, the method achieves a classification accuracy of 95.73% and a recall of 91.97%. When applied to real data and verified with cyberspace search engines, it achieves an accuracy of 53.75% and a recall of about 76.86%. The total number, geographical distribution, and network service providers of the unreachable nodes found in the experiments are analyzed statistically, providing technical support for Bitcoin supervision.
Key words:
- Bitcoin
- unreachable nodes
- reachable nodes
- decision tree
- cyberspace search engine

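As a rough illustration of the kind of classifier the abstract describes, the sketch below builds a small decision tree in plain Python with the two parameters varied in Tables 1 and 2: the maximum tree depth and a minimum number of samples per node. The two features (how often an address is announced and by how many distinct peers), the toy data, and the reading of the minimum as a per-child limit are assumptions made for illustration only; the source does not specify them. In practice a library implementation, such as scikit-learn's `DecisionTreeClassifier` with its `max_depth` and `min_samples_leaf` parameters, would normally be used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, min_samples):
    """Return (impurity, feature, threshold) of the best split, or None."""
    limit = max(1, min_samples)  # each child must keep at least this many rows
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if len(left) < limit or len(right) < limit:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build(rows, labels, max_depth, min_samples, depth=0):
    """Grow a tree; leaves are class labels, inner nodes are 4-tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, min_samples)
    if split is None:
        return majority
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build([r for r, _ in left], [y for _, y in left], max_depth, min_samples, depth + 1),
        build([r for r, _ in right], [y for _, y in right], max_depth, min_samples, depth + 1),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical toy data: (announcement count, distinct announcing peers).
# Label 1 = unreachable, 0 = reachable. Not taken from the paper.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 5), (9, 6), (7, 4), (10, 7)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

tree = build(rows, labels, max_depth=5, min_samples=2)
print(tree)                   # (0, 2, 1, 0): split on feature 0 at threshold 2
print(predict(tree, (1, 3)))  # 1
print(predict(tree, (9, 9)))  # 0
```

Raising `max_depth` lets the tree fit finer distinctions, while raising `min_samples` blocks splits that would leave very small groups; this is the trade-off that Tables 1 and 2 explore on the real feature set.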
Table 1. Decision tree depth and classification performance

| Tree depth | Accuracy/% | Recall/% | F1 |
| --- | --- | --- | --- |
| 1 | 85.84 | 99.26 | 0.8625 |
| 2 | 88.86 | 94.54 | 0.8836 |
| 3 | 89.10 | 97.20 | 0.8887 |
| 4 | 92.85 | 95.39 | 0.9227 |
| 5 | 95.72 | 91.98 | 0.9506 |
| 10 | 97.99 | 95.93 | 0.9771 |
| 15 | 99.32 | 98.83 | 0.9924 |
| 20 | 99.73 | 99.57 | 0.9969 |
| 22 | 99.76 | 99.66 | 0.9973 |
| 25 | 99.75 | 99.64 | 0.9972 |
| 30 | 99.75 | 99.64 | 0.9972 |

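Table 2. Minimum number of samples per node and classification performance

| Minimum samples | Accuracy/% | Recall/% | F1 |
| --- | --- | --- | --- |
| 100 | 95.72 | 91.98 | 0.9506 |
| 200 | 95.73 | 91.97 | 0.9381 |
| 300 | 95.67 | 92.19 | 0.9501 |
| 400 | 95.61 | 9.17 | 0.9492 |
| 500 | 95.44 | 91.27 | 0.9472 |
| 600 | 95.27 | 92.15 | 0.9458 |
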
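Table 3. Classification performance before and after optimization

| Classification algorithm | Accuracy/% | Recall/% | F1 |
| --- | --- | --- | --- |
| Decision tree (default parameters) | 85.84 | 86.67 | 0.8625 |
| Decision tree (optimized) | 95.73 | 91.97 | 0.9381 |
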
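Table 4. Search records and corresponding node types

| Search record | Reachable node | Unreachable node | Offline reachable node | Offline unreachable node | Fake node |
| --- | --- | --- | --- | --- | --- |
| Record can be retrieved | Yes | Yes | Yes | Yes | No |
| Bitcoin service provided | Yes | No | Yes | No | No |
| Exceeds the scan period | Yes | Yes | No | No | No |
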
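Table 5. Classification of real data

| Reachable nodes | Unreachable nodes | Offline reachable nodes | Offline unreachable nodes | Fake nodes | Total |
| --- | --- | --- | --- | --- | --- |
| 0 | 5688 | 113 | 2338 | 2444 | 10583 |
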
Table 6. Classification results

| Dataset | Accuracy/% | Recall/% | F1 |
| --- | --- | --- | --- |
| Experimental dataset | 95.73 | 91.97 | 0.9381 |
| Real data | 53.75 | 76.86 | 0.6326 |

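Table 7. Geographical distribution of nodes

| Region | Share of reachable nodes [4] /% | Share of unreachable nodes /% |
| --- | --- | --- |
| Europe | 58.07 | 50.21 |
| Americas | 30.91 | 29.91 |
| Asia | 9.43 | 18.18 |
| Oceania | 1.36 | 1.16 |
| Africa | 0.22 | 0.54 |
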
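Table 8. Statistics of network service providers

| Network service provider | Number of unreachable nodes | Share/% |
| --- | --- | --- |
| amazon.com | 372 | 7.3 |
| hetzner.com | 293 | 5.7 |
| digitalocean.com | 276 | 5.4 |
| China Telecom | 178 | 3.5 |
| comcast.com | 162 | 3.2 |
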
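The real-data figures in Tables 5 and 6 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the reported accuracy is the share of classified addresses that verification confirmed as unreachable, and that F1 is the usual harmonic mean of that value and the recall; both readings are assumptions, although they reproduce the reported numbers. The function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of two rates given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Table 5: verification outcome for the addresses classified as unreachable.
table5 = {
    "reachable": 0,
    "unreachable": 5688,
    "offline reachable": 113,
    "offline unreachable": 2338,
    "fake": 2444,
}

total = sum(table5.values())
share_confirmed = table5["unreachable"] / total

print(total)                               # 10583
print(round(100 * share_confirmed, 2))     # 53.75
print(round(f1_score(0.5375, 0.7686), 4))  # 0.6326 (real data, Table 6)
print(round(f1_score(0.9573, 0.9197), 4))  # 0.9381 (experimental dataset, Table 6)
```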
[1] NAKAMOTO S. Bitcoin: A peer-to-peer electronic cash system[EB/OL]. (2009-03-19)[2022-04-22].

[2] CAI X Q, DENG Y, ZHANG L, et al. The principle and core technology of blockchain[J]. Chinese Journal of Computers, 2021, 44(1): 84-131 (in Chinese).

[3] CASTRO S. Bitcoin P2P network sniffer[EB/OL]. (2012-08-26)[2022-04-22].

[4] LI R G, ZHU J W, XU D W, et al. Bitcoin network measurement and a new approach to infer the topology[J]. China Communications, 2022, 19(10): 169-179. doi: 10.23919/JCC.2022.00.030

[5] LI R G, SHEN M, YU H, et al. A survey on cyberspace search engines[C]//Proceedings of the China Cyber Security Annual Conference. Berlin: Springer, 2020: 206-214.

[6] DONET J A, PÉREZ-SOLA C, HERRERA-JOANCOMARTÍ J. The bitcoin P2P network[C]//Proceedings of the International Conference on Financial Cryptography and Data Security. Berlin: Springer, 2014: 87-102.

[7] FADHIL M, OWENSON G, ADDA M. A Bitcoin model for evaluation of clustering to improve propagation delay in Bitcoin network[C]//Proceedings of the IEEE International Conference on Computational Science and Engineering and IEEE International Conference on Embedded and Ubiquitous Computing and 15th International Symposium on Distributed Computing and Applications for Business Engineering. Piscataway: IEEE Press, 2016: 468-475.

[8] PARK S, IM S, SEOL Y, et al. Nodes in the Bitcoin network: Comparative measurement study and survey[J]. IEEE Access, 2019, 7: 57009-57022. doi: 10.1109/ACCESS.2019.2914098

[9] BIRYUKOV A, KHOVRATOVICH D, PUSTOGAROV I. Deanonymisation of clients in Bitcoin P2P network[C]//Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. New York: ACM, 2014: 15-29.

[10] NEUDECKER T, ANDELFINGER P, HARTENSTEIN H. Timing analysis for inferring the topology of the Bitcoin peer-to-peer network[C]//Proceedings of the International IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress. Piscataway: IEEE Press, 2016: 358-367.

[11] WANG L, PUSTOGAROV I. Towards better understanding of Bitcoin unreachable peers[EB/OL]. (2017-09-20)[2022-04-27].

[12] GRUNDMANN M, AMBERG H, HARTENSTEIN H. On the estimation of the number of unreachable peers in the Bitcoin P2P network by observation of peer announcements[EB/OL]. (2021-02-25)[2022-04-27].

[13] FRANZONI F, DAZA V. Improving Bitcoin transaction propagation by leveraging unreachable nodes[C]//Proceedings of the IEEE International Conference on Blockchain. Piscataway: IEEE Press, 2020: 196-203.

[14] BIRYUKOV A, PUSTOGAROV I. Bitcoin over tor isn’t a good idea[C]//Proceedings of the IEEE Symposium on Security and Privacy. Piscataway: IEEE Press, 2015: 122-134.

[15] MASTAN I D, PAUL S. A new approach to deanonymization of unreachable Bitcoin nodes[C]//Proceedings of the International Conference on Cryptology and Network Security. Berlin: Springer, 2018: 277-298.

[16] PAPPALARDO G, CALDARELLI G, ASTE T. The bitcoin peers network[EB/OL]. (2016-06-30)[2022-04-28].

[17] SALLAL M, DE FRÉIN R, MALIK A, et al. An empirical comparison of the security and performance characteristics of topology formation algorithms for Bitcoin networks[J]. Array, 2022, 15: 100221. doi: 10.1016/j.array.2022.100221