基于EfficientDet的无预训练SAR图像船舶检测器

包壮壮; 赵学军

doi:10.13700/j.bh.1001-5965.2020.0255

基于EfficientDet的无预训练SAR图像船舶检测器

doi: 10.13700/j.bh.1001-5965.2020.0255

包壮壮,
赵学军^,

空军工程大学基础部, 西安 710051

详细信息

通讯作者:
赵学军. E-mail: 292457155@qq.com

中图分类号: TN957.51;TP751
计量
- 文章访问数: 459
- HTML全文浏览量: 53
- PDF下载量: 175
- 被引次数: 0
出版历程
- 收稿日期: 2020-06-11
- 录用日期: 2020-09-04
- 网络出版日期: 2021-08-20

Ship detector in SAR images based on EfficientDet without pre-training

BAO Zhuangzhuang,
ZHAO Xuejun^,

Department of Basic Science, Air Force Engineering University, Xi'an 710051, China

More Information

Corresponding author: ZHAO Xuejun. E-mail: 292457155@qq.com

摘要

摘要:
针对多尺度、多场景的合成孔径雷达(SAR)图像船舶检测问题，提出了一种基于EfficientDet的无预训练目标检测器。现有的基于卷积神经网络的SAR图像船舶检测器并没有表现出其应有的出色性能。重要原因之一是依赖分类任务的预训练模型，没有有效的方法来解决SAR图像与自然场景图像之间存在的差异性；另一个重要原因是没有充分利用卷积神经网络各层的信息，特征融合能力不够强，难以处理包括海上和近海在内的多场景船舶检测，尤其是无法排除近海复杂背景的干扰。SED就这2个方面改进方法，在公开SAR船舶检测数据集上进行实验，检测精度指标平均准确率(AP)达到94.2%，与经典的深度学习检测器对比，超过最优的RetineNet模型1.3%，在模型大小、算力消耗和检测速度之间达到平衡，验证了所提模型在多场景条件下多尺度SAR图像船舶检测具有优异的性能。
- 船舶检测 /
- 合成孔径雷达(SAR) /
- 深度学习 /
- 卷积神经网络 /
- 目标检测
Abstract:
Aiming at the problem of multi-scale and multi-scene Synthetic Aperture Radar (SAR) ship detection, an object detector without pre-training based on EfficientDet is proposed. The existing SAR image ship detectors based on convolutional neural networks do not show excellent performance that it should have. One of the important reasons is that they depend on the pre-training model of the classification tasks, and there is no effective method to solve the difference between the SAR image and the natural scene image. Another important reason is that the information of each layer of the convolutional network is not fully utilized, the feature fusion ability is not strong enough to deal with the detection of ships in multiple scenes including sea and offshore, and especially the interference of complex offshore background cannot be ruled out. SED improves the method in these two aspects, and conducts experiments on the public SAR ship detection data set. The detection accuracy index AP of SED reaches 94.2%, which, compared with the classic deep learning detector, has exceeded the best RetineNet model by 1.3%, and achieved a balance among model size, computing power consumption and detection speed. This verifies that the model can achieve excellent performance in multi-scale SAR image ship detection in multiple scenes.
- ship detection /
- Synthetic Aperture Radar (SAR) /
- deep learning /
- convolutional neural network /
- object detection

HTML全文

图 1 数据归一化的方式

Figure 1. Methods of data normalization

下载: 全尺寸图片幻灯片

图 2 两种卷积对比

Figure 2. Comparison of two convolutions

下载: 全尺寸图片幻灯片

图 3 残差模块和倒残差模块数据流图对比

Figure 3. Comparison of data flow graph between residual blocks and inverted residual blocks

下载: 全尺寸图片幻灯片

图 4 特征融合网络设计

Figure 4. Design of feature fusion network

下载: 全尺寸图片幻灯片

图 5 网络结构示意图

Figure 5. Schematic diagram of network structure

下载: 全尺寸图片幻灯片

图 6 复杂背景下的船舶数据集可视化

Figure 6. Visualization of a ship dataset in complex background

下载: 全尺寸图片幻灯片

图 7 不同模型预测结果可视化

Figure 7. Visualized prediction results of different models

下载: 全尺寸图片幻灯片

表 1 消融实验

Table 1. Ablation experiment

条件	组成
预训练	√
BN	√	√	√
GN				√	√
BiFPN	√	√		√		√
改进BiFPN			√		√
AP_0.5/%	92.3	93.4	93.5	93.7	94.2	Nan
AP_0.5∶0.95/%	60.0	59.9	60.6	63.3	64.7	Nan

下载: 导出CSV

表 2 不同模型结果对比

Table 2. Comparison of results among different models

指标	SSD300	SSD512	Faster R-CNN (R50)	RetinaNet (R50)	EfficientDet-D0 (预训练)	EfficientDet-D4 (预训练)	SED
AP_0.5/%	88.5	89.6	91.8	92.9	92.3	93.4	94.2
AP_0.5∶0.95/%	49.1	51.4	54.9	57.1	60.0	62.7	64.7
训练时长/min	10	43	23	15	14	195	19
测试时长/s	77	126	114	115	144	326	227
图像处理速度/(fp·s^-1)	56.6	36.3	38.6	38.0	30.5	13.5	19.3
模型大小/MB	190.0	195.0	247.6	303.2	15.7	83.2	15.4

下载: 导出CSV

参考文献(30)

[1]	KANJIR U, GREIDANUS H, KRIS ˇTOF O. Vessel detection and classification from spaceborne optical images: A literature survey[J]. Remote Sensing of Envioronment, 2018, 207: 1-26.
[2]	WANG Y, WANG C, ZHANG H, et al. Automatic ship detection based on retinanet using multi-resolution gaofen-3 imagery[J]. Remote Sensing, 2019, 11(5): 531. doi: 10.3390/rs11050531
[3]	EL-DARYMLI K, GILL E W, MCGUIRE P, et al. Automatic target recognition in synthetic aperture tadar imagery: A state-of-the-art review[J]. IEEE Access, 2016, 4: 6014-6058. doi: 10.1109/ACCESS.2016.2611492
[4]	YANG C S, PARK J H, RASHID A. An improved method of land masking for synthetic aperture radar-based ship detection[J]. Journal of Navigation, 2018, 71(4): 788-804. doi: 10.1017/S037346331800005X
[5]	MOLINA D E, GLEICH D, DATCU M, et al. Gibbs random field models for model-based despeckling of SAR images[J]. IEEE Geoscience and Remote Sensing Letters, 2009, 7(1): 73-77.
[6]	QIN X X, ZHOU S L, ZOU H X, et al. A CFAR detection algorithm for generalized gamma distributed background in high-resolution SAR images[J]. IEEE Geoscience and Remote Sensing Letters, 2013, 10(4): 806-810. doi: 10.1109/LGRS.2012.2224317
[7]	ZHAO J, ZHANG Z, YU W, et al. A cascade coupled convolutional neural network guided visual attention method for ship detection from SAR images[J]. IEEE Access, 2018, 6: 50693-50708. doi: 10.1109/ACCESS.2018.2869289
[8]	LI J, QU C, SHAO J. Ship detection in SAR images based on an improved faster R-CNN[C]//2017 SAR in Big Data Era: Models, Methods and Applications. Piscataway: IEEE Press, 2017: 1-6.
[9]	WANG Y, WANG C, ZHANG H, et al. A SAR dataset of ship detection for deep learning under complex backgrounds[J]. Remote Sensing, 2019, 11(7): 765. doi: 10.3390/rs11070765
[10]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot multiBox detector[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 21-37.
[11]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems, 2015: 91-99.
[12]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
[13]	NOGUEIRA K, PENATTI O A, DOS SANTOS J A. Towards better exploiting convolutional neural networks for remote sensing scene classification[J]. Pattern Recognition, 2017, 61: 539-556. doi: 10.1016/j.patcog.2016.07.001
[14]	SHEN Z, LIU Z, LI J, et al. DSOD: Learning deeply supervised object detectors from scratch[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1919-1927.
[15]	ZHU R, ZHANG S, WANG X, et al. ScratchDet: Training single-shot object detectors from scratch[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 2268-2277.
[16]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2117-2125.
[17]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[18]	TAN M, PANG R, LE Q V. EfficientDet: Scalable and efficient object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2020: 10781-10790.
[19]	GIRSHICK R. Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1440-1448.
[20]	HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2961-2969.
[21]	REDMON J, FARHADI A. YOLOv3: An incremental improvement[EB/OL]. [2020-05-13].
[22]	TIAN Z, SHEN C, CHEN H, et al. FCOS: Fully convolutional one-stage object detection[C]//2019 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9627-9636.
[23]	LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
[24]	GHIASI G, LIN TY, LE Q V. NAS-FPN: Learning scalable feature pyramid architecture for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7036-7045.
[25]	IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on International Conference on Machine Learning. New York: ACM Press, 2015: 448-456.
[26]	WU Y, HE K. Group normalization[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
[27]	HOWARD A G, ZHU M, CHEN B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[EB/OL]. [2020-05-13].
[28]	SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4510-4520.
[29]	CHEN X, FANG H, LIN T, et al. Microsoft COCO captions: Data collection and evaluation server[J]. (2015-04-03)[2020-05-13].
[30]	KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. (2017-01-30)[2020-05-13].