Volume 48, Issue 12, Dec. 2022
LI Zheyang, ZHANG Ruyi, TAN Wenming, et al. A graph convolution network based latency prediction algorithm for convolution neural network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(12): 2450-2459. doi: 10.13700/j.bh.1001-5965.2021.0149 (in Chinese)
A graph convolution network based latency prediction algorithm for convolution neural network

doi: 10.13700/j.bh.1001-5965.2021.0149
Funds: National Key R&D Program of China (2018YFC0807706)
More Information
  • Corresponding author: TAN Wenming, E-mail: tanwenming@hikvision.com
  • Received Date: 29 Mar 2021
  • Accepted Date: 06 Jun 2021
  • Publish Date: 29 Jun 2021
Abstract: Obtaining the inference latency of a convolutional neural network (CNN) via a learnable prediction algorithm has attracted increasing attention. Existing latency predictors suffer from two major problems. First, the high complexity of the CNN design space makes data collection enormously expensive. Second, traditional algorithms fail to accurately model the effect of the hardware compiler's operator fusion on latency. To solve these problems, this paper proposes a latency predictor based on a graph convolution network (GCN). The algorithm treats the latency of a complete network as the accumulation of compensated per-node latencies, and uses graph convolution to model the effect of operator fusion. Furthermore, a differential training algorithm is proposed to reduce the size of the input space and improve the generalization of the predictor. Experiments on HISI3559 over the MB-C continuous search space show that the algorithm reduces the average relative error from 302% to 5.3%. In addition, replacing a traditional latency predictor with the proposed one enables neural architecture search algorithms to find high-precision networks whose latency is closer to the target.
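The abstract's core mechanism, summing per-node latency contributions after graph-convolution passes over the operator graph so that each node's estimate is influenced by the neighbors it may be fused with, can be illustrated with a minimal NumPy sketch. The shapes, toy weights, and two-layer structure below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def normalize_adj(adj):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, the standard
    # GCN propagation rule; self-loops keep each node's own features.
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def gcn_latency(adj, feats, w1, w2, w_out):
    # Each message-passing step mixes an operator's features with those of
    # its graph neighbors, so fusion-related context reaches the per-node
    # latency head; the network latency is the sum of node contributions.
    a_norm = normalize_adj(adj)
    h = np.maximum(a_norm @ feats @ w1, 0.0)   # GCN layer 1 + ReLU
    h = np.maximum(a_norm @ h @ w2, 0.0)       # GCN layer 2 + ReLU
    node_latency = h @ w_out                   # per-node latency contribution
    return float(node_latency.sum())           # accumulate over all nodes
```

In the paper's setting the node features would encode operator type and tensor shape, and the adjacency matrix would come from the compiled computation graph; here any nonnegative toy inputs yield a positive scalar latency estimate.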



    Figures(7)  / Tables(3)
