CHAI G Q,BO X S,LIU H J,et al. Self-supervised scene depth estimation for monocular images based on uncertainty[J]. Journal of Beijing University of Aeronautics and Astronautics,2024,50(12):3780-3787 (in Chinese) doi: 10.13700/j.bh.1001-5965.2022.0943
Self-supervised scene depth estimation for monocular images based on uncertainty

doi: 10.13700/j.bh.1001-5965.2022.0943
Funds:  National Natural Science Foundation of China (62201333,62001063); Basic Research Plan of Shanxi Province (20210302124647); Science and Technology Innovation Project of Colleges and Universities in Shanxi Province (2021L269)
More Information
  • Corresponding author: E-mail: haijun_liu@cqu.edu.cn
  • Received Date: 24 Nov 2022
  • Accepted Date: 17 Mar 2023
  • Available Online: 31 Mar 2023
  • Publish Date: 27 Mar 2023
  • Depth information plays an important role in accurately understanding the three-dimensional structure of a scene and the three-dimensional relationships between objects in an image. An end-to-end self-supervised depth estimation algorithm for monocular images based on uncertainty was proposed in this paper by combining structure-from-motion, image reprojection, and uncertainty theory. The depth map of the target image was obtained by an encoder-decoder depth estimation network built on an improved densely connected module, and the transformation matrix between the camera poses of the target image and the source image was computed by a pose estimation network. The source image was then sampled pixel by pixel according to the image reprojection to obtain a reconstructed target image. The proposed algorithm was optimized by a reconstruction objective function, an uncertainty objective function, and a smoothness objective function, and self-supervised depth estimation was realized by minimizing the difference between the reconstructed image and the real target image. Experimental results show that the proposed algorithm achieves better depth estimation than mainstream algorithms such as competitive collaboration (CC), Monodepth2, and HR-Depth in terms of both objective indicators and subjective visual comparison.
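The uncertainty objective described above can be illustrated with a short sketch. The code below is a minimal NumPy illustration, not the paper's implementation: the function name `uncertainty_photometric_loss` and the toy images are hypothetical, and the loss is assumed to take the common aleatoric-uncertainty form of Kendall and Gal [30], |I − Î|·e^(−s) + s, where s is a per-pixel predicted log-uncertainty.

```python
import numpy as np

def uncertainty_photometric_loss(target, reconstructed, log_sigma):
    """Per-pixel reconstruction error attenuated by predicted uncertainty.

    Pixels with high predicted uncertainty (e.g. moving objects that
    violate the static-scene assumption behind image reprojection)
    contribute less to the loss; the additive log-sigma term penalises
    predicting large uncertainty everywhere.
    """
    abs_err = np.abs(target - reconstructed)
    return float(np.mean(abs_err * np.exp(-log_sigma) + log_sigma))

# Toy example: a 4x4 "image" pair with one badly reconstructed pixel.
rng = np.random.default_rng(0)
target = rng.random((4, 4))
recon = target.copy()
recon[0, 0] += 10.0                # large violation at one pixel

uniform = np.zeros((4, 4))         # no uncertainty predicted anywhere
adaptive = np.zeros((4, 4))
adaptive[0, 0] = 3.0               # high uncertainty at the bad pixel

# Localising uncertainty on the inconsistent pixel lowers the loss.
assert uncertainty_photometric_loss(target, recon, adaptive) < \
       uncertainty_photometric_loss(target, recon, uniform)
```

In training, s would be an extra output channel of the depth decoder, so the network learns where reprojection is unreliable instead of being penalised for it.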

     

  • [1] LI H G, WANG Y P, LIAO Y P, et al. Perception and control method of driverless mining vehicle[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(11): 2335-2344 (in Chinese).
    [2] CHENG Z Y, ZHANG Y, TANG C K. Swin-depth: Using transformers and multi-scale fusion for monocular-based depth estimation[J]. IEEE Sensors Journal, 2021, 21(23): 26912-26920. doi: 10.1109/JSEN.2021.3120753
    [3] IZADINIA H, SHAN Q, SEITZ S M. IM2CAD[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2422-2431.
    [4] ZHANG Y Y, XIONG Z W, YANG Z, et al. Real-time scalable depth sensing with hybrid structured light illumination[J]. IEEE Transactions on Image Processing, 2014, 23(1): 97-109. doi: 10.1109/TIP.2013.2286901
    [5] LEE J, KIM Y, LEE S, et al. High-quality depth estimation using an exemplar 3D model for stereo conversion[J]. IEEE Transactions on Visualization and Computer Graphics, 2015, 21(7): 835-847. doi: 10.1109/TVCG.2015.2398440
    [6] DENG H P, SHENG Z C, XIANG S, et al. Depth estimation based on semantic guidance for light field image[J]. Journal of Electronics & Information Technology, 2022, 44(8): 2940-2948 (in Chinese).
    [7] ZHANG J, CAO Y, ZHA Z J, et al. A unified scheme for super-resolution and depth estimation from asymmetric stereoscopic video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(3): 479-493. doi: 10.1109/TCSVT.2014.2367356
    [8] YANG J Y, ALVAREZ J M, LIU M M. Self-supervised learning of depth inference for multi-view stereo[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 7522-7530.
    [9] FU H, GONG M M, WANG C H, et al. Deep ordinal regression network for monocular depth estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2002-2011.
    [10] UMMENHOFER B, ZHOU H Z, UHRIG J, et al. DeMoN: Depth and motion network for learning monocular stereo[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5622-5631.
    [11] KENDALL A, MARTIROSYAN H, DASGUPTA S, et al. End-to-end learning of geometry and context for deep stereo regression[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 66-75.
    [12] HAMBARDE P, MURALA S. S2DNet: Depth estimation from single image and sparse samples[J]. IEEE Transactions on Computational Imaging, 2020, 6: 806-817. doi: 10.1109/TCI.2020.2981761
    [13] BADKI A, TROCCOLI A, KIM K, et al. Bi3D: Stereo depth estimation via binary classifications[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 1597-1605.
    [14] DU Q C, LIU R K, PAN Y, et al. Depth estimation with multi-resolution stereo matching[C]//Proceedings of the IEEE Visual Communications and Image Processing. Piscataway: IEEE Press, 2019: 1-4.
    [15] JOHNSTON A, CARNEIRO G. Self-supervised monocular trained depth estimation using self-attention and discrete disparity volume[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4755-4764.
    [16] SONG M, LIM S, KIM W. Monocular depth estimation using Laplacian pyramid-based depth residuals[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(11): 4381-4393. doi: 10.1109/TCSVT.2021.3049869
    [17] RANJAN A, JAMPANI V, BALLES L, et al. Competitive collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 12232-12241.
    [18] GODARD C, MAC AODHA O, FIRMAN M, et al. Digging into self-supervised monocular depth estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 3827-3837.
    [19] ZHOU T H, BROWN M, SNAVELY N, et al. Unsupervised learning of depth and ego-motion from video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6612-6619.
    [20] LI K H, FU Z H, WANG H Y, et al. Adv-depth: Self-supervised monocular depth estimation with an adversarial loss[J]. IEEE Signal Processing Letters, 2021, 28: 638-642. doi: 10.1109/LSP.2021.3065203
    [21] ZOU Y L, JI P, TRAN Q H, et al. Learning monocular visual odometry via self-supervised long-term modeling[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 710-727.
    [22] LYU X Y, LIU L, WANG M M, et al. HR-depth: High resolution self-supervised monocular depth estimation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021, 35(3): 2294-2301.
    [23] WAN Y C, ZHAO Q K, GUO C, et al. Multi-sensor fusion self-supervised deep odometry and depth estimation[J]. Remote Sensing, 2022, 14(5): 1228. doi: 10.3390/rs14051228
    [24] MAHJOURIAN R, WICKE M, ANGELOVA A. Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 5667-5675.
    [25] LIU X M, DU M Z, MA Z B, et al. Depth estimation method of light field image based on occlusion scene[J]. Acta Optica Sinica, 2020, 40(5): 0510002 (in Chinese). doi: 10.3788/AOS202040.0510002
    [26] YIN Z C, SHI J P. GeoNet: Unsupervised learning of dense depth, optical flow and camera pose[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1983-1992.
    [27] KONG C, LUCEY S. Deep non-rigid structure from motion with missing data[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(12): 4365-4377. doi: 10.1109/TPAMI.2020.2997026
    [28] HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2261-2269.
    [29] WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2018: 1451-1460.
    [30] KENDALL A, GAL Y. What uncertainties do we need in Bayesian deep learning for computer vision?[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. California: NIPS, 2017: 5580-5590.
    [31] GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: The KITTI dataset[J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237. doi: 10.1177/0278364913491297

    Figures(8)  / Tables(3)
