Citation: GAO F, MENG D S, XIE Z Y, et al. Multi-source remote sensing image classification based on Transformer and dynamic 3D-convolution[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(2): 606-614 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0397
Benefiting from the complementarity and synergy of multi-source remote sensing data, deep learning-based methods have made significant progress in remote sensing image classification in recent years. Building a powerful joint classification model for multi-source data remains difficult for two reasons: feature fusion is hampered by the heterogeneous gap between hyperspectral image (HSI) and light detection and ranging (LiDAR) data, and representation power, efficiency, and interpretability are constrained by the current static inference paradigm. To solve both problems, we propose a Transformer-based fusion network. Specifically, to bridge the heterogeneous gap between HSI and LiDAR data, we design a Transformer-based feature fusion module that exploits the feature interactions between the multi-source data. We then build a multi-scale dynamic 3D-convolution module that collects information at different scales and uses it to modulate the 3D-convolution kernel. The method was validated on the Houston and Trento datasets, where its overall accuracy reached 94.60% and 98.21%, respectively. Compared with mainstream methods such as MGA-MFN, the overall accuracy on the two datasets improved by at least 0.97% and 0.25%, respectively. The experimental results demonstrate that the proposed method can effectively improve the accuracy of multi-source remote sensing image classification.
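The abstract does not specify how the Transformer-based fusion module is built. As a minimal sketch of one common way to let two modalities interact, the Python below applies scaled dot-product cross-attention in both directions between toy HSI and LiDAR token lists. The function names, the toy token values, and the omission of learned query/key/value projections are all assumptions made for illustration; this is not the authors' implementation, and it does not cover the dynamic 3D-convolution module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys.

    Hypothetical helper: learned projections are omitted, so the tokens
    themselves act as queries, keys, and values.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to each key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Output token is the attention-weighted average of the values.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Toy tokens (assumed values): 3 HSI tokens and 2 LiDAR tokens, each 2-D.
hsi_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lidar_tokens = [[0.5, 0.5], [2.0, 0.0]]

# HSI queries read from LiDAR, and LiDAR queries read from HSI.
hsi_enriched = cross_attention(hsi_tokens, lidar_tokens, lidar_tokens)
lidar_enriched = cross_attention(lidar_tokens, hsi_tokens, hsi_tokens)
```

Each output token is a convex combination of the other modality's tokens, so one modality's features are re-expressed in terms of the other's. A real network would typically add learned projections, multiple heads, and residual connections around this step.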