基于多分辨率特征自选择的遮挡物识别算法

谢祥颖; 来广志; 那峙雄; 骆欣; 王栋

doi:10.13700/j.bh.1001-5965.2021.0289

基于多分辨率特征自选择的遮挡物识别算法

doi: 10.13700/j.bh.1001-5965.2021.0289

国网数字科技控股有限公司, 北京 100053

基金项目:

国家重点研发计划 2018YFB1500800

国家电网有限公司科技项目 SGTJDK00DYJS2000148

详细信息

通讯作者:
骆欣, E-mail: lx@ustc.edu.cn

中图分类号: TP391
计量
- 文章访问数: 313
- HTML全文浏览量: 138
- PDF下载量: 220
- 被引次数: 0
出版历程
- 收稿日期: 2021-06-02
- 录用日期: 2021-07-04
- 网络出版日期: 2021-07-23
- 整期出版日期: 2022-07-20

Occlusion recognition algorithm based on multi-resolution feature auto-selection

State Grid Digital Technology Holding Co., Ltd., Beijing 100053, China

Funds:

National Key R & D Program of China 2018YFB1500800

Technology Project of State Grid Corporation of China SGTJDK00DYJS2000148

More Information

Corresponding author: LUO Xin, E-mail: lx@ustc.edu.cn

摘要

摘要:
光伏组件的遮挡物识别是光伏运维系统中不可或缺的环节，传统识别算法多依赖人工巡检，成本高昂且效率低下。基于卷积神经网络，提出了一种面向光伏组件的遮挡物识别算法PORNet。通过引入特征金字塔，构建多个分辨率下具有丰富语义信息的图像特征，提升对遮挡物尺度和密度的敏感性。通过特征自选择，筛选出语义最具代表性的特征图，以加强物体环境的语义信息表达。用筛选出的特征图完成遮挡物识别，从而提升识别准确率。在自建光伏组件落叶遮挡数据集上进行了实验比较和分析，并对识别性能进行了评估，通过与现有物体识别算法相比，所提算法的准确率和召回率分别提升了9.21%和15.79%。
- 光伏组件 /
- 遮挡物识别 /
- 卷积神经网络 /
- 特征金字塔 /
- 特征自选择
Abstract:
The identification of obstructions of photovoltaic modules is an indispensable link in modern photovoltaic operation and maintenance systems. Traditional identification methods mostly rely on manual inspections, but they are costly and inefficient. Therefore, based on the convolutional neural network, PORNet, an occlusion recognition algorithm for photovoltaic modules, is proposed. By introducing feature pyramids, image features with rich semantic information at multiple resolutions are constructed, enhancing the sensitivity to the scale and density of occlusions. Through feature auto-selection, the most representative feature maps are screened out to strengthen the semantic information expression of the object contexts. Finally, the screened feature map is used to complete the occlusion recognition, improving the recognition accuracy. Experimental comparison and analysis are carried out on the self-built photovoltaic module falling leaf occlusion dataset, and the recognition performance is evaluated. Compared with existing object recognition methods, the accuracy and recall rate of the proposed method are increased by 9.21% and 15.79%, respectively.
- photovoltaic module /
- occlusion recognition /
- convolutional neural network /
- feature pyramid /
- feature auto-selection

HTML全文

图 1 本文算法整体流程示意图

Figure 1. Overall process of the proposed algorithm

下载: 全尺寸图片幻灯片

图 2 残差单元结构

Figure 2. Structure of residual unit

下载: 全尺寸图片幻灯片

图 3 实际实现中残差单元结构

Figure 3. Structure of residual unit in actual implementations

下载: 全尺寸图片幻灯片

图 4 特征提取网络结构

Figure 4. Structure of feature extraction network

下载: 全尺寸图片幻灯片

图 5 多分辨率特征提取网络结构

Figure 5. Structure of multiple resolution feature extraction network

下载: 全尺寸图片幻灯片

图 6 特征自选择模块结构

Figure 6. Structure of feature auto-selection module

下载: 全尺寸图片幻灯片

图 7 训练及测试样本示意图

Figure 7. Illustration of partial training and test samples

下载: 全尺寸图片幻灯片

图 8 不同尺度测试正样本示意图

Figure 8. Illustration of test positive samples with different scales

下载: 全尺寸图片幻灯片

图 9 易误召测试负样本示意图

Figure 9. Illustration of easy recalled test negative samples

下载: 全尺寸图片幻灯片

图 10 难召回测试正样本高激活区域可视化

Figure 10. Visualization of high response regions for hard test samples

下载: 全尺寸图片幻灯片

表 1 符号表示

Table 1. Summary of main notations' representation

符号	含义
y_i	第i张图片类别
ReLU	ReLU函数
Sigmoid	Sigmoid函数
BN	批归一化层
FC	全连接层
GAP	全局平均池化
L_cls	分类损失函数
Conv	卷积层

下载: 导出CSV

表 2 各模块特征图信息

Table 2. Feature map information of different modules

特征图名称	尺度	通道数
C₁	56	64
C₂	28	128
C₃	14	256
C₄	7	512
P₁	56	256
P₂	28	256
P₃	14	256
P₄	7	256

下载: 导出CSV

表 3 不同算法测试结果

Table 3. Test results of different algorithms

算法	准确率/%	召回率/%	AUC
VGG11	84.21	68.42	0.916 2
VGG13	86.84	73.68	0.941 8
VGG16	92.11	84.21	0.981 3
Res18	89.47	81.58	0.965 4
FuseRes18	89.47	78.95	0.943 2
PORNet	98.68	97.37	0.991 5

下载: 导出CSV

表 4 不同算法运行时参数

Table 4. Runtime parameters of different algorithms

算法	参数量/10⁶	MAC/10⁹	速度/(帧·s^-1)
VGG11	8.79	6.98	201.79
VGG13	8.97	10.43	180.74
VGG16	14.03	14.31	156.14
Res18	10.66	1.69	125.38
FuseRes18	13.15	4.07	120.08
PORNet	13.15	4.07	115.83

下载: 导出CSV

参考文献(23)

[1]	JIA D, WEI D, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
[2]	KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems, 2012: 1106-1114.
[3]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]//International Conference on Learning Representations, 2015: 1-14.
[4]	SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 815-823.
[5]	LIU W, WEN Y, YU Z, et al. SphereFace: Deep hypersphere embedding for face recognition[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6738-6746.
[6]	LI W, ZHU X, GONG S. Person re-identification by deep joint learning of multi-loss classification[C]//Proceedings of International Joint Conference on Artificial Intelligence. New York: ACM, 2017: 2194-2200.
[7]	ZHONG Z, LIANG Z, CAO D, et al. Re-ranking person re-identification with k-reciprocal encoding[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 3652-3661.
[8]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[9]	ZAGORUYKO S, KOMODAKIS N. Wide residual networks[C]//Proceedings of the British Machine Vision Conference, 2016: 1-12.
[10]	HUANG G, LIU Z, LAURENS V, et al. Densely connected convolutional networks[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2261-2269.
[11]	XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5987-5995.
[12]	SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5693-5703.
[13]	YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1-9.
[14]	DAI J, QI H, XIONG Y, et al. Deformable convolutional networks[C]//IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 764-773.
[15]	LI D, HU J, WANG C, et al. Involution: Inverting the inherence of convolution for visual recognition[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021.
[16]	ZHANG H, CISSE M, DAUPHIN Y N, et al. Mixup: Beyond empirical risk minimization[C]//International Conference on Learning Representations, 2018.
[17]	DEVRIES T, TAYLOR G W. Improved regularization of convolutional neural networks with cutout[EB/OL]. (2017-08-15)[2021-06-01]. http://arxiv.org/abs/1708.04552.
[18]	YUN S, HAN D, OH S J, et al. CutMix: Regularization strategy to train strong classifiers with localizable features[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 6023-6032.
[19]	CUBUK E D, ZOPH B, MANE D, et al. AutoAugment: Learning augmentation policies from data[EB/OL]. (2018-05-24)[2021-06-01]. https://arxiv.org/abs/1805.09501.
[20]	HO D, LIANG E, STOICA I, et al. Population based augmentation: Efficient learning of augmentation policy schedules[C]//Proceedings of the 36th International Conference on Machine Learning, 2019: 2731-2741.
[21]	LIM S, KIM I, KIM T, et al. Fast AutoAugment[EB/OL]. (2019-05-01)[2021-06-01]. http://arxiv.org/abs/1905.00397.
[22]	LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 936-944.
[23]	SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[J] International Journal of Computer Vision, 2020, 128(2): 336-359.