留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于CNN的多尺寸航拍图像定位方法的研究与实现

潘海侠 徐嘉璐 李锦涛 王赟豪 王华锋

潘海侠, 徐嘉璐, 李锦涛, 等 . 基于CNN的多尺寸航拍图像定位方法的研究与实现[J]. 北京航空航天大学学报, 2019, 45(11): 2170-2176. doi: 10.13700/j.bh.1001-5965.2019.0045
引用本文: 潘海侠, 徐嘉璐, 李锦涛, 等 . 基于CNN的多尺寸航拍图像定位方法的研究与实现[J]. 北京航空航天大学学报, 2019, 45(11): 2170-2176. doi: 10.13700/j.bh.1001-5965.2019.0045
PAN Haixia, XU Jialu, LI Jintao, et al. Research and implementation of multi-size aerial image positioning method based on CNN[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(11): 2170-2176. doi: 10.13700/j.bh.1001-5965.2019.0045(in Chinese)
Citation: PAN Haixia, XU Jialu, LI Jintao, et al. Research and implementation of multi-size aerial image positioning method based on CNN[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(11): 2170-2176. doi: 10.13700/j.bh.1001-5965.2019.0045(in Chinese)

基于CNN的多尺寸航拍图像定位方法的研究与实现

doi: 10.13700/j.bh.1001-5965.2019.0045
详细信息
    作者简介:

    潘海侠  女, 博士, 副教授。主要研究方向:人工智能、模式识别、计算机视觉与图像处理

    王华锋  男, 博士, 副教授。主要研究方向:图像识别、基于图像的测量

    通讯作者:

    王华锋.E-mail:wanghuafeng@buaa.edu.cn

  • 中图分类号: V221+.3; TB553

Research and implementation of multi-size aerial image positioning method based on CNN

More Information
  • 摘要:

    图像定位常用于无人机视觉导航,传统的无人机视觉导航广泛采用景象匹配导航方式,随着计算机技术的不断发展,深度学习技术为视觉导航的实现提供了新途径。以无人机的垂直侦查为背景,将飞行区域的航拍图像划分成大小相同的若干网格,每个网格代表一类区域,用网格图像制作数据集训练卷积神经网络(CNN)。基于AlexNet设计了一种融合显著性特征的全卷积网络模型,有效实现了一个基于CNN的多尺寸输入的滑动窗口分类器,并提出了一种邻域显著性参照定位策略来筛选分类结果,从而实现多尺寸航拍图像的定位。

     

  • 图 1  FCN滑动窗口形式

    Figure 1.  FCN sliding window

    图 2  航拍图像的显著性特征图

    Figure 2.  Saliency feature map of aerial image

    图 3  Multi-channel AlexNet-FCN示意图

    Figure 3.  Schematic diagram of multi-channel AlexNet-FCN

    图 4  三维曲面样例

    Figure 4.  3D surface sample diagram

    图 5  显著性权重示意图

    Figure 5.  Schematic diagram of saliency weight

    图 6  概率矩阵热力图

    Figure 6.  Heat map of probability matrix

    图 7  哈尔滨市的航拍图像

    Figure 7.  Aerial image of Harbin

    图 8  ROC曲线

    Figure 8.  ROC curves

    表  1  查询图像集准确率

    Table  1.   Query image set accuracy

    方法 输入 准确率/% 所用环境 平均运行时间/s
    AlexNet-FCN RGB 94.1 0.017
    multi-channel
    AlexNet-FCN
    RGB+显著特征
    RGB+HOG
    RGB+LBP
    95.4
    94.6
    94.2
    GPU: Titan X
    Pytorch
    0.022
    0.191
    0.034
    SIFT
    SURF
    ORB
    RGB 63.3
    73.6
    61.9
    CPU: E5-2670 37.865
    31.25
    44.586
    下载: 导出CSV

    表  2  航拍图像为512×512大小时top-k下的准确率

    Table  2.   Accuracy at top-k when aerial image size is 512×512

    是否使用邻域显著性参照定位 区域所占完整网格比例/% 准确率/%
    k=1 k=2 k=3 k=4
    100 67.3 39.5 26.6 19.9
    ≥80 81.2 68.1 49.3 38.0
    ≥60 83.9 74.1 56.0 43.6
    ≥40 85.1 76.0 58.3 45.5
    ≥20 85.6 77.1 59.8 46.7
    ≥0 86.1 77.9 61.1 47.9
    100 69.4 40.1 27.0 20.3
    ≥80 83.1 69.0 49.9 38.2
    ≥60 85.3 75.0 56.7 44.0
    ≥40 86.3 77.0 59.0 45.8
    ≥20 86.7 77.8 60.4 47.1
    ≥0 87.1 78.5 61.6 48.3
    下载: 导出CSV

    表  3  航拍图像为768×768大小时top-k下的准确率

    Table  3.   Accuracy at top-k when aerial image size is 768×768

    是否使用邻域显著性参照定位 区域所占完整网格比例/% 准确率/%
    k=1 k=2 k=3 k=4
    100 65.7 62.1 59.6 55.4
    ≥80 94.6 93.3 91.3 88.2
    ≥60 95.7 94.9 93.3 90.5
    ≥40 96.8 95.8 94.3 91.7
    ≥20 97.2 96.2 94.8 92.3
    ≥0 97.6 96.6 95.3 93.2
    100 68.8 65.8 57.5 51.0
    ≥80 95.7 94.2 89.2 84.3
    ≥60 96.6 95.7 91.3 87.4
    ≥40 97.6 96.5 92.4 88.8
    ≥20 97.8 96.7 92.8 89.6
    ≥0 98.1 97.1 93.5 90.0
    下载: 导出CSV

    表  4  航拍图像为512×512大小时top-k下的召回率

    Table  4.   Recall rate at top-k when aerial image size is 512×512

    是否使用邻域显著性参照定位 区域所占完整网格比例/% 召回率/%
    k=1 k=2 k=3 k=4
    100 67.3 79.1 79.8 79.8
    ≥80 46.1 72.3 76.9 78.2
    ≥60 29.7 51.3 57.1 58.9
    ≥40 20.0 35.8 41.3 43.0
    ≥20 14.8 26.8 31.6 33.1
    ≥0 8.6 15.5 18.2 19.0
    100 69.4 80.3 81.1 81.1
    ≥80 47.2 73.4 77.9 78.9
    ≥60 30.1 51.8 57.8 59.3
    ≥40 20.3 36.2 41.8 43.3
    ≥20 15.0 27.1 31.9 33.3
    ≥0 8.7 15.6 18.3 19.1
    下载: 导出CSV

    表  5  航拍图像为768×768大小时top-k下的召回率

    Table  5.   Recall rate at top-k when aerial image size is 768×768

    是否使用邻域显著性参照定位 区域所占完整网格比例/% 召回率/%
    k=1 k=2 k=3 k=4
    100 21.5 40.5 58.1 72.3
    ≥80 16.6 32.6 47.9 61.5
    ≥60 12.6 25.0 36.9 47.7
    ≥40 10.1 20.0 29.5 38.2
    ≥20 8.1 16.0 23.7 30.8
    ≥0 5.6 11.0 16.3 21.0
    100 22.6 43.1 61.5 75.1
    ≥80 16.8 32.9 48.5 62.3
    ≥60 12.8 25.3 37.3 48.1
    ≥40 10.2 20.1 29.7 38.5
    ≥20 8.2 16.1 23.8 31.0
    ≥0 5.6 11.1 16.4 21.3
    下载: 导出CSV
  • [1] PREWITT J M S.Object enhancement and extraction[M]//LIPKIN B S, ROSENFELD A.Picture processing and psychopictorics.Salt Lake City: Academic Press, 1970: 75-149.
    [2] MARR D, HILDRETH E.Theory of edge detection[J].Proceedings of the Royal Society of London, 1980, 207(1167):187-217. http://d.old.wanfangdata.com.cn/OAPaper/oai_doaj-articles_849b7f238dda9761c6f2919ad1c04adf
    [3] CANNY J.A computational approach to edge detection[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, 8(6):679-698. http://d.old.wanfangdata.com.cn/OAPaper/oai_doaj-articles_3d39d0b11988c5f90bf44b10d764f020
    [4] HARRIS C, STEPHENS M.A combined corner and edge detector[C]//4th Alvey Vision Conference, 1988: 117-151.
    [5] SMITHSM, BRADYM.SUSAN-Anewapproachtolowlevel image processing[J].Intemational Joumal of Computer Vision, 1997, 23(1):45-78. doi: 10.1023/A%3A1007963824710
    [6] MIKOLAJCZYK K, SCHMID C.An affine invariant interest point detector[C]//Proceedings of European Conference on Computer Vision.Berlin: Springer, 2002: 128-142. http://www.springerlink.com/content/d946wrklrq1fnpjh
    [7] LOWE D G.Distinctive image features from scale-invariant keypoints[J].International Journal of Computer Vision, 2004, 60(2):91-110. http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ025429678/
    [8] BAY H, TUVTELLARS T, VAN GOOL L.SURF: Speeded up robust features[C]//Proceedings of the European Conference on Computer Vision.Berlin: Springer, 2006: 404-417. doi: 10.1007/11744023_32
    [9] RUBLEE E, RABAUD V, KONOLIGE K, et al.ORB: An efficient alternative to SIFT or SURF[C]//IEEE International Conference on Computer Vision.Piscataway, NJ: IEEE Press, 2012: 2564-2571. http://www.researchgate.net/publication/221111151
    [10] HAUSDORFF F.Grundzüge der mengenlehre[M].Von Veit:Grundzüge der mengenlehre, 1914:A34-A35.
    [11] HUTTENLOCHER D P, KLANDERMAN G A, RUCKLIDGE W J.Comparing images using the Hausdorff distance[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993, 15(9):850-863. http://cn.bing.com/academic/profile?id=99ceaba6c7dae17907a35f4c081e639e&encoded=0&v=paper_preview&mkt=zh-cn
    [12] DUBUISSON M P, JAIN A K.A modified Hausdorff distance for object matching[C]//Proceedings of the 12th International Conference on Pattern Recognition.Piscataway, NJ: IEEE Press, 1994: 566-568. http://www.researchgate.net/publication/3684010_A_modified_Hausdorff_distance_for_object_matching
    [13] ZHAO C, SHI W, DENG Y.A new Hausdorff distance for image matching[J].Pattern Recognition Letters, 2005, 26(5):581-586. doi: 10.1016-j.patrec.2004.09.022/
    [14] BELONGIE S, MALIK J, PUZICHA J, et al.Shape context: A new descriptor for shape matching and object recognition[C]//Proceedings of the 13th International Conference on Neural Information Processing Systems.Cambridge: MIT Press, 2000: 831-837. http://www.researchgate.net/publication/2353535_Shape_Context_A_new_descriptor_for_shape_matching_and_object_recognition
    [15] BELONGIE S, MALIK J, PUZICHA J.Shape matching and object recognition using shape context[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(4):509-522. http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0217363455/
    [16] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.ImageNet classification with deep convolutional neural networks[J].Communications of the ACM, 2012, 60(6):84-90. http://cn.bing.com/academic/profile?id=1dc5d01904d2c274eaec2181a93aa339&encoded=0&v=paper_preview&mkt=zh-cn
    [17] SERMANET P, EIGEN D, ZHANG X, et al.OverFeat: Integrated recognition, localization and detection using convolutional networks[EB/OL].(2013-12-21)[2019-02-10].https: //arxiv.org/abs/1312.6229.
    [18] HOU X, HAREL J, KOCH C.Image signature:Highlighting sparse salient regions[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(1):194-201. http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0224793953/
    [19] OJALA T, PIETIKÄINEN M, MÄENPÄÄ T.Multiresolution gray-scale and rotation invariant texture classification with local binary patterns[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7):971-987. http://cn.bing.com/academic/profile?id=bbab6ef5dd30d698066c963f5602e424&encoded=0&v=paper_preview&mkt=zh-cn
  • 加载中
图(8) / 表(5)
计量
  • 文章访问数:  761
  • HTML全文浏览量:  47
  • PDF下载量:  371
  • 被引次数: 0
出版历程
  • 收稿日期:  2019-02-13
  • 录用日期:  2019-06-21
  • 网络出版日期:  2019-11-20

目录

    /

    返回文章
    返回
    常见问答