Traffic classification algorithm of Internet of things devices based on random forest
Abstract:
Traffic classification of Internet of things (IoT) devices is important for the management of network assets, and classification based on traffic statistics is an active research topic. Existing algorithms build their feature vectors mainly from flow information and make little use of packet information. This paper improves a random-forest-based traffic classification algorithm for IoT devices by building the feature vectors from both the flow information and the packet information of each flow. Experimental results show that, compared with previous algorithms, the proposed algorithm raises the average classification accuracy from 56% to 82%, the average recall from 47% to 67%, and the average F1 score from 0.43 to 0.74; the confusion matrix also improves noticeably. The proposed algorithm therefore classifies IoT device traffic more effectively than previous ones.
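As a rough illustration of the random forest idea the abstract refers to, the sketch below trains bootstrapped decision trees on random feature subsets and classifies by majority vote. It is not the authors' implementation: the function names, the tree depth, the number of trees, and the two-feature toy data (flow count, total bytes sent) are all assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, max_depth=5):
    """Grow one tree; a leaf is a label, an inner node is a 4-tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # random feature subset per split
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], rng, max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], rng, max_depth - 1))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Toy training data (invented): (flow count, total bytes sent) per sample.
rows = [(50, 900000), (60, 800000), (55, 950000), (48, 870000),
        (3, 1200), (4, 1500), (2, 900), (5, 1100)]
labels = ["camera"] * 4 + ["plug"] * 4
forest = fit_forest(rows, labels)
print(predict_forest(forest, (52, 880000)), predict_forest(forest, (1, 800)))
```

In practice a library implementation would normally be preferred; the hand-written version is only meant to make the bootstrap, feature-subset, and voting steps visible.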
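Table 1. IoT device list

| No. | Device name | Device type |
|---|---|---|
| 1 | Smart Things | IoT device |
| 2 | Amazon Echo | IoT device |
| 3 | Netatmo Welcome | IoT device |
| 4 | TP-Link Day Night Cloud camera | IoT device |
| 5 | Samsung SmartCam | IoT device |
| 6 | Dropcam | IoT device |
| 7 | Insteon Camera 1 | IoT device |
| 8 | Insteon Camera 2 | IoT device |
| 9 | Withings Smart Baby Monitor | IoT device |
| 10 | Belkin Wemo switch | IoT device |
| 11 | TP-Link Smart plug | IoT device |
| 12 | iHome | IoT device |
| 13 | Belkin wemo motion sensor | IoT device |
| 14 | NEST Protect smoke alarm | IoT device |
| 15 | Netatmo weather station | IoT device |
| 16 | Withings Smart scale | IoT device |
| 17 | Blipcare Blood Pressure meter | IoT device |
| 18 | Withings Aura smart sleep sensor | IoT device |
| 19 | LiFX Smart Bulb | IoT device |
| 20 | Triby Speaker | IoT device |
| 21 | PIX-STAR Photo-frame | IoT device |
| 22 | HP Printer | Non-IoT device |
| 23 | Samsung Galaxy Tab | Non-IoT device |
| 24 | Nest Dropcam | Non-IoT device |
| 25 | Android Phone | Non-IoT device |
| 26 | Laptop | Non-IoT device |
| 27 | MacBook | Non-IoT device |
| 28 | Android Phone | Non-IoT device |
| 29 | IPhone | Non-IoT device |
| 30 | MacBook/Iphone | Non-IoT device |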
Table 2. Feature vectors
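| Name | Definition | Type |
|---|---|---|
| Number of flows | Number of flows in each sample | Flow information |
| Protocol types | Protocol types contained in each sample | Flow information |
| Most used protocol | Protocol that occurs most often in each sample | Flow information |
| Number of ports | Number of ports connected to in each sample | Flow information |
| Most used port | Port with the most connections in each sample | Flow information |
| Total bytes sent | Total bytes sent in each sample | Packet information |
| Total bytes received | Total bytes received in each sample | Packet information |
| Total packets sent | Total packets sent in each sample | Packet information |
| Total packets received | Total packets received in each sample | Packet information |
| Sent-packet TTL | Time to live of the first sent packet in each sample | Packet information |
| Received-packet TTL | Time to live of the first received packet in each sample | Packet information |
| First-packet send window | Packet send window of the first flow in each sample | Packet information |
| First-packet receive window | Packet receive window of the first flow in each sample | Packet information |

Table 3.

| Class | Accuracy/% | Recall/% | F1 score |
|---|---|---|---|
| Amazon Echo | 44 | 69 | 0.54 |
| Android Phone 1 | 0 | 0 | 0 |
| Android Phone 2 | 0 | 0 | 0 |
| Belkin Wemo switch | 42 | 37 | 0.39 |
| Belkin wemo motion sensor | 43 | 51 | 0.47 |
| Dropcam | 66 | 100 | 0.79 |
| HP Printer | 98 | 44 | 0.61 |
| Insteon Camera 1 | 58 | 94 | 0.72 |
| Laptop | 0 | 0 | 0 |
| LiFX Smart Bulb | 0 | 0 | 0 |
| MacBook | 1 | 12 | 0.01 |
| NEST Protect smoke alarm | 0 | 0 | 0 |
| Nest Dropcam | 0 | 0 | 0 |
| Netatmo Welcome | 24 | 12 | 0.16 |
| Netatmo weather station | 100 | 5 | 0.09 |
| Samsung Galaxy Tab | 0 | 0 | 0 |
| Samsung SmartCam | 55 | 50 | 0.52 |
| Smart Things | 91 | 47 | 0.62 |
| TP-Link Day Night Cloud camera | 0 | 0 | 0 |
| TP-Link Smart plug | 0 | 0 | 0 |
| Triby Speaker | 65 | 0 | 0.01 |
| Withings Aura smart sleep sensor | 62 | 68 | 0.65 |
| Withings Smart Baby Monitor | 33 | 74 | 0.45 |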
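The features listed in Table 2 could be assembled from per-flow records roughly as follows. This is a minimal sketch under stated assumptions: the dictionary keys, the function name `build_feature_vector`, and the sample values are hypothetical, "protocol types" is taken to mean the number of distinct protocols, and the TTL and window features are read from the earliest flow in the sample.

```python
from collections import Counter

def build_feature_vector(flows):
    """Build one Table 2 style feature vector from the flows of a sample.

    `flows` is a non-empty, time-ordered list of dicts, one per flow.
    The key names are hypothetical placeholders for the capture tool's fields.
    """
    protocols = Counter(f["protocol"] for f in flows)
    ports = Counter(f["port"] for f in flows)
    first = flows[0]  # assumption: the earliest flow supplies TTL and window values
    return {
        # Flow information
        "flow_count": len(flows),
        "protocol_types": len(protocols),
        "top_protocol": protocols.most_common(1)[0][0],
        "port_count": len(ports),
        "top_port": ports.most_common(1)[0][0],
        # Packet information
        "bytes_sent": sum(f["bytes_sent"] for f in flows),
        "bytes_received": sum(f["bytes_received"] for f in flows),
        "packets_sent": sum(f["packets_sent"] for f in flows),
        "packets_received": sum(f["packets_received"] for f in flows),
        "sent_ttl": first["sent_ttl"],
        "received_ttl": first["received_ttl"],
        "send_window": first["send_window"],
        "receive_window": first["receive_window"],
    }

# Invented sample with three flows.
sample = [
    {"protocol": "TCP", "port": 443, "bytes_sent": 1200, "bytes_received": 5400,
     "packets_sent": 10, "packets_received": 12, "sent_ttl": 64, "received_ttl": 52,
     "send_window": 29200, "receive_window": 28960},
    {"protocol": "UDP", "port": 53, "bytes_sent": 80, "bytes_received": 160,
     "packets_sent": 1, "packets_received": 1, "sent_ttl": 64, "received_ttl": 60,
     "send_window": 0, "receive_window": 0},
    {"protocol": "TCP", "port": 443, "bytes_sent": 900, "bytes_received": 3000,
     "packets_sent": 8, "packets_received": 9, "sent_ttl": 64, "received_ttl": 52,
     "send_window": 29200, "receive_window": 28960},
]
vector = build_feature_vector(sample)
print(vector)
```

Categorical entries such as the most used protocol would still need a numeric encoding before being passed to a classifier; that step is left out here.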
Table 4. Classification results of the proposed algorithm
| Class | Accuracy/% | Recall/% | F1 score |
|---|---|---|---|
| Amazon Echo | 69 | 84 | 0.76 |
| Belkin Wemo switch | 60 | 58 | 0.59 |
| Belkin wemo motion sensor | 63 | 29 | 0.4 |
| Dropcam | 100 | 99 | 1 |
| HP Printer | 86 | 66 | 0.75 |
| Insteon Camera 1 | 87 | 99 | 0.92 |
| LiFX Smart Bulb | 97 | 41 | 0.57 |
| NEST Protect smoke alarm | 0 | 0 | 0 |
| Nest Dropcam | 0 | 0 | 0 |
| Netatmo Welcome | 64 | 61 | 0.62 |
| Netatmo weather station | 98 | 86 | 0.91 |
| PIX-STAR Photo-frame | 0 | 0 | 0 |
| Samsung SmartCam | 58 | 92 | 0.71 |
| Smart Things | 97 | 86 | 0.91 |
| TP-Link Day Night Cloud camera | 0 | 0 | 0 |
| TP-Link Smart plug | 0 | 0 | 0 |
| Triby Speaker | 89 | 43 | 0.59 |
| Withings Aura smart sleep sensor | 91 | 72 | 0.81 |
| Withings Smart Baby Monitor | 60 | 91 | 0.73 |
| Withings Smart scale | 0 | 0 | 0 |
| iHome | 100 | 9 | 0.16 |
| non-IOT | 89 | 53 | 0.66 |
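The F1 column in the per-class results appears consistent with the harmonic mean of the two percentage columns, with the first one treated as precision. The sketch below recomputes it for three rows of Table 4; because the percentages are already rounded, the recomputed values can differ from the reported ones by about 0.01. The function name `f1_score` is chosen for this example only.

```python
def f1_score(precision_pct, recall_pct):
    """Harmonic mean of two percentages, returned on a 0-1 scale."""
    p, r = precision_pct / 100.0, recall_pct / 100.0
    if p + r == 0:
        return 0.0  # convention for classes that are never predicted correctly
    return 2 * p * r / (p + r)

# (first percentage column, recall %, reported F1) copied from Table 4
table4_rows = {
    "Amazon Echo": (69, 84, 0.76),
    "HP Printer": (86, 66, 0.75),
    "Smart Things": (97, 86, 0.91),
}
for name, (p, r, reported) in table4_rows.items():
    print(f"{name}: computed {f1_score(p, r):.3f}, reported {reported}")
```

The zero guard matters for rows such as NEST Protect smoke alarm, where both percentages are 0 and the plain formula would divide by zero.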
Table 5. Algorithm performance comparison
| Algorithm | Running time/s | Memory/MB |
|---|---|---|
| Algorithm of Ref. [15] | 14.5 | 342 |
| Proposed algorithm | 18 | 451 |
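The cost of the extra features can be read off Table 5 as a relative increase. A small calculation using only the table's figures (the variable and function names are illustrative) gives roughly 24% more running time and 32% more memory than the algorithm of Ref. [15].

```python
# Figures copied from Table 5.
reference = {"time_s": 14.5, "memory_mb": 342}  # algorithm of Ref. [15]
proposed = {"time_s": 18.0, "memory_mb": 451}   # proposed algorithm

def relative_increase(new, old):
    """Fractional increase of `new` over `old`."""
    return (new - old) / old

time_overhead = relative_increase(proposed["time_s"], reference["time_s"])
memory_overhead = relative_increase(proposed["memory_mb"], reference["memory_mb"])
print(f"time: +{time_overhead:.0%}, memory: +{memory_overhead:.0%}")
```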
[1] HUANG K Q, CHEN X T, KANG Y F, et al. Intelligent visual surveillance: A review[J]. Chinese Journal of Computers, 2015, 38(6): 1093-1118 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-JSJX201506001.htm
[2] FENG X, LI Q, HAN Q, et al. Identification of visible industrial control devices at Internet scale[C]//2016 IEEE International Conference on Communications. Piscataway: IEEE Press, 2016: 1-6.
[3] LI Q, FENG X, WANG H, et al. Automatically discovering surveillance devices in the cyberspace[C]//The 8th ACM. New York: ACM, 2017: 331-342.
[4] FENG X, LI Q, WANG H, et al. Acquisitional rule-based engine for discovering Internet-of-thing devices[C]//27th USENIX Security Symposium, 2018: 327-341.
[5] LEONARD D, LOGUINOV D. Demystifying service discovery: Implementing an Internet-wide scanner[C]//Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement 2010. New York: ACM, 2010: 109-122.
[6] KOHNO T, BROIDO A, CLAFFY K C. Remote physical device fingerprinting[J]. IEEE Transactions on Dependable and Secure Computing, 2005, 2(2): 93-108. doi: 10.1109/TDSC.2005.26
[7] ANEJA S, ANEJA N, ISLAM M S. IoT device fingerprint using deep learning[C]//2018 IEEE International Conference on Internet of Things and Intelligence System. Piscataway: IEEE Press, 2018: 174-179.
[8] HUSÁK M, ČERMÁK M, JIRSÍK T, et al. HTTPS traffic analysis and client identification using passive SSL/TLS fingerprinting[J]. EURASIP Journal on Information Security, 2016, 2016(1): 1-14. doi: 10.1186/s13635-015-0028-6
[9] ARUNAN S, HASSAN H G, FRANCO L, et al. Classifying IoT devices in smart environments using network traffic characteristics[J]. IEEE Transactions on Mobile Computing, 2019, 18(8): 1745-1759. doi: 10.1109/TMC.2018.2866249
[10] MSADEK N, SOUA R, ENGEL T. IoT device fingerprinting: Machine learning based encrypted traffic analysis[C]//2019 IEEE Wireless Communications and Networking Conference (WCNC). Piscataway: IEEE Press, 2019: 1-8.
[11] YAO H, GAO P, WANG J, et al. Capsule network assisted IoT traffic classification mechanism for smart cities[J]. IEEE Internet of Things Journal, 2019, 6(5): 7515-7525. doi: 10.1109/JIOT.2019.2901348
[12] DESAI B A, DIVAKARAN D M, NEVAT I, et al. A feature-ranking framework for IoT device classification[C]//International Conference on Communication Systems & Networks, 2019: 64-71.
[13] MEIDAN Y, BOHADANA M, SHABTAI A, et al. ProfilIoT: A machine learning approach for IoT device identification based on network traffic analysis[C]//Proceedings of the Symposium on Applied Computing, 2017: 506-509.
[14] SHAHID M R, BLANC G, ZHANG Z, et al. IoT devices recognition through network traffic analysis[C]//IEEE International Conference on Big Data. Piscataway: IEEE Press, 2018: 5187-5192.
[15] SIVANATHAN A, SHERRATT D, GHARAKHEILI H H, et al. Characterizing and classifying IoT traffic in smart cities and campuses[C]//IEEE INFOCOM 2017-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Piscataway: IEEE Press, 2017: 559-564.