Abstract: The cipher suite is the cornerstone of secure communication in Transport Layer Security (TLS). It comprises a key exchange algorithm, a symmetric cipher algorithm, and a message digest algorithm, of which the symmetric cipher encrypts the data in actual communication. By collecting and analyzing real traffic, this paper obtains the distribution of TLS cipher suites in a live network. An analysis method based on ciphertext image reconstruction, the National Institute of Standards and Technology (NIST) randomness test suite, and a convolutional neural network (CNN) is then designed to analyze the ciphertext randomness of the mainstream symmetric ciphers (AES, ChaCha20) and of other common symmetric ciphers (DES, 3DES, RC2, RC4). The experimental results show that, in electronic codebook (ECB) mode, the ciphertexts of all compared symmetric ciphers have poor randomness and fail most tests. The ciphertexts of AES and ChaCha20, the two mainstream TLS symmetric ciphers, have good randomness in all modes other than ECB and resist cipher algorithm identification based on CNN and random forest. The results can serve as a reference for selecting TLS cipher suites and for in-depth analysis of encrypted traffic.
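The three components named in the abstract can be read directly off a TLS 1.2 style suite name. The following is a minimal sketch, not taken from the paper: the names `SuiteParts` and `split_suite` are hypothetical, and it assumes the naming pattern `TLS_<key exchange>_WITH_<cipher>_<digest>` that the suites in Table 1 follow.

```python
from typing import NamedTuple


class SuiteParts(NamedTuple):
    key_exchange: str  # key exchange (and authentication), e.g. "ECDHE_RSA"
    cipher: str        # symmetric cipher with key size and mode, e.g. "AES_128_GCM"
    digest: str        # trailing hash name, e.g. "SHA256"


def split_suite(name: str) -> SuiteParts:
    """Split a suite name of the assumed form TLS_<kx>_WITH_<cipher>_<digest>."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError(f"unexpected suite name: {name}")
    # Everything between "TLS_" and "_WITH_" is the key exchange part.
    kx, rest = name[len("TLS_"):].split("_WITH_", 1)
    # The last underscore-separated token is the digest; the rest is the cipher.
    cipher, digest = rest.rsplit("_", 1)
    return SuiteParts(kx, cipher, digest)


if __name__ == "__main__":
    print(split_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
```

Splitting on the last underscore keeps multi-token cipher names such as `CHACHA20_POLY1305` intact, which a naive split on every underscore would not.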
Table 1. Proportion of TLS cipher suites in actual network traffic

| Cipher suite | Flows | Share/% |
| --- | --- | --- |
| TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 | 364 600 | 59.39 |
| TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 | 108 966 | 17.75 |
| TLS_RSA_WITH_AES_256_GCM_SHA384 | 27 888 | 4.54 |
| TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 | 27 043 | 4.40 |
| TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 | 15 080 | 2.46 |
| TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 | 13 258 | 2.16 |
| TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA | 11 933 | 1.94 |
| TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA | 11 106 | 1.81 |
| TLS_RSA_WITH_AES_128_GCM_SHA256 | 9 788 | 1.59 |
| TLS_RSA_WITH_AES_256_CBC_SHA | 4 964 | 0.81 |
| TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 | 3 776 | 0.62 |
| TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA | 2 393 | 0.39 |
| TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 | 2 116 | 0.34 |
| TLS_RSA_WITH_AES_128_CBC_SHA | 2 112 | 0.34 |
| TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA | 1 721 | 0.28 |
| TLS_RSA_WITH_AES_256_CBC_SHA256 | 1 569 | 0.26 |
| TLS_ECDHE_RSA_WITH_RC4_128_SHA | 1 354 | 0.22 |
| TLS_DHE_RSA_WITH_AES_128_GCM_SHA256 | 812 | 0.13 |
| TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 | 688 | 0.11 |
| TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 | 668 | 0.11 |

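Table 2. Cipher algorithms and modes of operation used in the experiment

| Cipher algorithm | Mode of operation |
| --- | --- |
| AES | ECB, CBC, CTR, GCM |
| DES | ECB, CBC |
| 3DES | ECB, CBC |
| RC2 | ECB, CBC |
| RC4 | |
| ChaCha20 | |

Table 3.

| Test | Raw data: index | Raw data: result | AES_ECB: index | AES_ECB: result | ChaCha20: index | ChaCha20: result | AES_GCM: index | AES_GCM: result | RC4: index | RC4: result |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| monobit test | 0 | Fail | 0 | Fail | 0.297 | Pass | 0.403 | Pass | 0.827 | Pass |
| frequency within block test | 0 | Fail | 0 | Fail | 0.603 | Pass | 0.181 | Pass | 0.105 | Pass |
| runs test | 0 | Fail | 0 | Fail | 0.764 | Pass | 0.504 | Pass | 0.926 | Pass |
| longest run ones in a block test | 0.002 | Fail | 0.003 | Fail | 0.572 | Pass | 0.611 | Pass | 0.691 | Pass |
| binary matrix rank test | 0 | Fail | 0 | Fail | 0.404 | Pass | 0.787 | Pass | 0.489 | Pass |
| dft test | 0 | Fail | 0 | Fail | 0.013 | Pass | 0.011 | Pass | 0.829 | Pass |
| non overlapping template matching test | 0.001 | Fail | 0.970 | Pass | 1.000 | Pass | 1.000 | Pass | 0.983 | Pass |
| overlapping template matching test | 0 | Fail | 0 | Fail | 0.887 | Pass | 0.370 | Pass | 0.567 | Pass |
| linear complexity test | 0 | Fail | 0 | Fail | 0.303 | Pass | 0.939 | Pass | 0.047 | Pass |
| serial test | 0 | Fail | 0 | Fail | 0.055 | Pass | 0.257 | Pass | 0.888 | Pass |
| approximate entropy test | 0 | Fail | 0 | Fail | 0.054 | Pass | 0.257 | Pass | 0.887 | Pass |
| cumulative sums test | 0 | Fail | 0 | Fail | 0.413 | Pass | 0.290 | Pass | 0.696 | Pass |
| random excursion test | 0.001 | Fail | 0.173 | Pass | 0.047 | Pass | 0.057 | Pass | 0.325 | Pass |
| random excursion variant test | 0.414 | Pass | 0.160 | Pass | 0.187 | Pass | 0.042 | Pass | 0.035 | Pass |
| Maurer's universal test | 0 | Fail | 0 | Fail | 0.335 | Pass | 0.011 | Pass | 0.048 | Pass |
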
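To make the pass/fail entries in the randomness results above concrete, here is a minimal sketch of the simplest test in NIST SP 800-22, the frequency (monobit) test. It is an illustration, not the tool used in the paper. The function names `monobit_p_value` and `passes` are hypothetical, and the 0.01 significance level is NIST's usual default, assumed here because the source does not state the threshold.

```python
import math


def monobit_p_value(data: bytes) -> float:
    """P-value of the NIST SP 800-22 frequency (monobit) test for a byte string."""
    n = len(data) * 8                       # number of bits under test
    if n == 0:
        raise ValueError("empty input")
    ones = sum(bin(byte).count("1") for byte in data)
    s_n = 2 * ones - n                      # +1 for each one bit, -1 for each zero bit
    s_obs = abs(s_n) / math.sqrt(n)         # normalized test statistic
    return math.erfc(s_obs / math.sqrt(2))  # complementary error function gives the P-value


def passes(p_value: float, alpha: float = 0.01) -> bool:
    """Assumed decision rule: pass when the P-value is at least alpha."""
    return p_value >= alpha


if __name__ == "__main__":
    print(passes(monobit_p_value(b"\x00" * 100)))  # heavily biased input
    print(passes(monobit_p_value(b"\x0f" * 100)))  # perfectly balanced bit counts
```

The second input shows why a single test is not sufficient: a periodic pattern with balanced bit counts passes the monobit test even though it is plainly not random, which is why the suite combines many tests.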
Table 4. Cipher suites used in the experiment

| ID | Cipher suite name |
| --- | --- |
| 0xc014 | TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA |
| 0xc030 | TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 |
| 0xcca8 | TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 |
| 0xc011 | TLS_ECDHE_RSA_WITH_RC4_128_SHA |

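Table 5. Accuracy of random forest binary classification

| Class 1 | Class 2 | Accuracy/% |
| --- | --- | --- |
| AES_ECB | AES_CBC | 55.3 |
| AES_ECB | AES_GCM | 58.2 |
| AES_ECB | ChaCha20 | 57.1 |
| AES_ECB | RC4 | 57.9 |
| AES_CBC | AES_GCM | 48.3 |
| AES_CBC | ChaCha20 | 47.9 |
| AES_CBC | RC4 | 50.0 |
| AES_GCM | ChaCha20 | 48.7 |
| AES_GCM | RC4 | 50.1 |
| ChaCha20 | RC4 | 48.4 |
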
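In Table 5, only the pairs involving AES_ECB score noticeably above the roughly 50% seen for the other pairs. A well-known property of ECB is that identical plaintext blocks encrypt to identical ciphertext blocks under the same key. The sketch below measures that kind of repetition in a ciphertext. It is an illustrative check only, not the feature set or classifier used in the paper; the name `repeated_block_fraction` is hypothetical, and the default block size of 16 bytes assumes AES (DES, 3DES, and RC2 use 8-byte blocks).

```python
from collections import Counter


def repeated_block_fraction(ciphertext: bytes, block_size: int = 16) -> float:
    """Fraction of full blocks whose value occurs more than once in the ciphertext."""
    # Cut the ciphertext into full blocks; a trailing partial block is ignored.
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext) - block_size + 1, block_size)]
    if not blocks:
        return 0.0
    counts = Counter(blocks)
    # Count every block that belongs to a value seen at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(blocks)


if __name__ == "__main__":
    sample = b"A" * 16 + b"B" * 16 + b"A" * 16 + b"C" * 16
    print(repeated_block_fraction(sample))
```

For ciphertext that behaves like random data, the fraction should stay close to zero at realistic lengths, so a clearly non-zero value on structured plaintext hints at ECB-like determinism.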