Ship target recognition method based on multi-source image fusion for unmanned aerial vehicle aerial photography
-
Abstract: A fusion-based recognition method is proposed for multi-source ship images captured by unmanned aerial vehicle (UAV) aerial photography. Because real scenes contain many sources of interference, the method fuses infrared and visible-light ship images at the pixel level and then performs target recognition on the fused image. Compared with feature-level recognition methods, this reduces the network's dependence on training samples and makes the method more interpretable. The method focuses on the pixel offset caused by differing sensor parameters. To avoid the distortion and artifacts that traditional image registration tends to introduce, image registration is recast as end-to-end feature alignment. The proposed multi-source ship image fusion recognition network consists of four modules: a cross-modulation feature extraction module, a feature dynamic alignment module, a multi-granularity feature refinement module, and a pyramid feature fusion module. Together they fuse the features and texture details of the different image modalities and improve recognition performance on ship targets. Experiments show that the method fuses multi-source images well and is highly interpretable, and that it recognizes ship targets with high accuracy and robustness.
-
Table 1. Ship category statistics

| Type | Count |
| --- | --- |
| Warship | 517 |
| Sailboat | 7642 |
| Cruise ship | 436 |
| Bulk carrier | 678 |
| Fishing boat | 4251 |
| Yacht | 21578 |
| Container ship | 863 |

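The class counts in Table 1 are strongly skewed toward yachts, which can be quantified with a few lines of arithmetic. The sketch below computes each class's share of the dataset and the ratio between the largest and smallest class. The inverse-frequency weights are an assumption added for illustration only; this section does not say whether or how the authors rebalance classes.

```python
# Ship counts copied from Table 1.
counts = {
    "Warship": 517,
    "Sailboat": 7642,
    "Cruise ship": 436,
    "Bulk carrier": 678,
    "Fishing boat": 4251,
    "Yacht": 21578,
    "Container ship": 863,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}
imbalance_ratio = max(counts.values()) / min(counts.values())

# Hypothetical inverse-frequency class weights (not the authors' setup):
# weight = total / (number_of_classes * class_count).
weights = {name: total / (len(counts) * n) for name, n in counts.items()}

# Print classes from largest to smallest.
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {n:6d}  share={shares[name]:.3f}  weight={weights[name]:.2f}")
print("total:", total, " max/min ratio:", round(imbalance_ratio, 1))
```

The counts sum to 35965 images, of which yachts account for about 60%. The largest class is roughly 49.5 times the size of the smallest (cruise ships).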
Table 2. Ablation experimental results

| Model | MI | $ Q_{{\mathrm{AB}}/{\mathrm{F}}} $ | SSIM | EN |
| --- | --- | --- | --- | --- |
| Baseline | 1.242 | 0.356 | 1.052 | 5.945 |
| +CM | 1.565 | 0.376 | 1.095 | 6.031 |
| +FDAM | 1.605 | 0.403 | 1.241 | 6.156 |
| +CMF | 1.689 | 0.467 | 1.264 | 6.347 |
| +PFFM | 1.897 | 0.478 | 1.279 | 6.779 |
| +LFR | 1.920 | 0.513 | 1.341 | 7.056 |

Table 3. Comparison of fusion performance

| Method | CE | MI | $ Q_{{\mathrm{AB}}/{\mathrm{F}}} $ | $ Q_{{\mathrm{CB}}} $ | SSIM | $ Q_{{\mathrm{CV}}} $ | SD | EN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DDcGAN | 0.951 | 1.023 | 0.411 | 0.491 | 1.108 | 1125.2 | 48.08 | 7.500 |
| FusionGAN | 2.041 | 1.344 | 0.248 | 0.433 | 1.170 | 1149.1 | 23.78 | 6.892 |
| NestFuse | 0.910 | 1.916 | 0.497 | 0.527 | 1.304 | 573.2 | 41.34 | 7.024 |
| PMG1 | 0.978 | 1.230 | 0.426 | 0.484 | 1.251 | 610.6 | 32.13 | 6.745 |
| SDNet | 1.076 | 0.997 | 0.434 | 0.517 | 1.227 | 902.4 | 25.30 | 6.868 |
| U2Fusion | 0.591 | 1.066 | 0.446 | 0.553 | 1.307 | 632.7 | 22.56 | 6.456 |
| GTF | 0.751 | 1.354 | 0.352 | 0.437 | 1.170 | 1561.7 | 25.46 | 6.347 |
| SwinFusion | 0.472 | 1.895 | 0.481 | 0.554 | 1.225 | 553.1 | 38.13 | 6.779 |
| Proposed method | 0.454 | 1.920 | 0.513 | 0.561 | 1.341 | 516.6 | 41.38 | 7.056 |

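Three of the fusion metrics reported in Tables 2 and 3 (EN, SD, and MI) have simple histogram-based definitions. The following is a minimal NumPy sketch of the commonly used forms, assuming 8-bit grayscale inputs. The function names, the 256-bin histograms, and the convention that fusion MI is the sum of the mutual information between each source image and the fused image are assumptions; this section does not state the exact definitions or normalization the authors used, so the values will not necessarily match the tables.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (EN) of an 8-bit grayscale image, in bits."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())

def standard_deviation(img):
    """Standard deviation (SD) of pixel intensities."""
    return float(np.std(img.astype(np.float64)))

def mutual_information(a, b, bins=256):
    """Mutual information between two equally sized 8-bit images, in bits."""
    joint, _, _ = np.histogram2d(
        a.ravel(), b.ravel(), bins=bins, range=[[0, 256], [0, 256]]
    )
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def fusion_mi(ir, vis, fused):
    """Assumed convention: MI(ir, fused) + MI(vis, fused)."""
    return mutual_information(ir, fused) + mutual_information(vis, fused)

# Synthetic stand-ins for a registered infrared/visible pair.
rng = np.random.default_rng(0)
ir = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
vis = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
# Plain averaging is used here only as a placeholder fusion rule.
fused = ((ir.astype(np.float64) + vis) / 2).astype(np.uint8)

print("EN:", entropy(fused))
print("SD:", standard_deviation(fused))
print("MI:", fusion_mi(ir, vis, fused))
```

Higher EN and MI indicate that the fused image carries more information overall and retains more of it from the two sources, which is the direction in which the tables report the proposed method improving.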
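Table 4. Comparison of recognition performance

Recognition accuracy by classification network (values as reported; the source header gives the unit as %).

| Image | VGG16 | ResNet50 | DenseNet201 | Inception-v4 | SqueezeNet-v1.1 | Xception | AlexNet |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Fused image | 0.923 | 0.931 | 0.932 | 0.930 | 0.927 | 0.925 | 0.928 |
| Visible-light image | 0.865 | 0.870 | 0.871 | 0.867 | 0.864 | 0.862 | 0.868 |
| Infrared image | 0.848 | 0.853 | 0.844 | 0.846 | 0.841 | 0.837 | 0.839 |
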
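The recognition accuracies in Table 4 can be summarized as a per-network gain of the fused images over each single modality. The short sketch below does only that arithmetic, using the values exactly as reported. The variable names are illustrative.

```python
# Accuracies copied from Table 4, in the table's column order.
networks = ["VGG16", "ResNet50", "DenseNet201", "Inception-v4",
            "SqueezeNet-v1.1", "Xception", "AlexNet"]
accuracy = {
    "fused":    [0.923, 0.931, 0.932, 0.930, 0.927, 0.925, 0.928],
    "visible":  [0.865, 0.870, 0.871, 0.867, 0.864, 0.862, 0.868],
    "infrared": [0.848, 0.853, 0.844, 0.846, 0.841, 0.837, 0.839],
}

# Mean accuracy per input type across the seven networks.
mean = {kind: sum(vals) / len(vals) for kind, vals in accuracy.items()}

# Per-network gain of fused input over each single modality.
gain_vs_visible = [f - v for f, v in zip(accuracy["fused"], accuracy["visible"])]
gain_vs_infrared = [f - i for f, i in zip(accuracy["fused"], accuracy["infrared"])]

for name, gv, gi in zip(networks, gain_vs_visible, gain_vs_infrared):
    print(f"{name:16s} +{gv:.3f} vs visible, +{gi:.3f} vs infrared")
print({kind: round(m, 3) for kind, m in mean.items()})
```

Averaged over the seven networks, the fused images reach 0.928, against 0.867 for visible-light images and 0.844 for infrared images. The gain is about 0.06 over visible light and between 0.075 and 0.089 over infrared for every network listed.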

