
Monocular image based 3D model retrieval using triplet network

DU Yujia, LI Haisheng, YAO Chunlian, CAI Qiang

Citation: DU Yujia, LI Haisheng, YAO Chunlian, et al. Monocular image based 3D model retrieval using triplet network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1691-1700. doi: 10.13700/j.bh.1001-5965.2020.0057 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2020.0057
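Funds:

National Natural Science Foundation of China 61877002

Beijing Natural Science Foundation and Fengtai Rail Transit Frontier Research Joint Fund L191009

Beijing Municipal Education Commission Research Team Construction Project PXM2019_014213_000007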

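More Information
    Author biographies:

    DU Yujia, female, master's student. Main research interest: computer graphics

    YAO Chunlian, female, Ph.D., associate professor. Main research interests: video and image processing, embedded system design

    CAI Qiang, male, Ph.D., professor, doctoral supervisor. Main research interest: computer graphics

    Corresponding author:

    LI Haisheng, E-mail: lihsh@btbu.edu.cn

  • CLC number: TP183; TP391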

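  • Abstract:

    With the growing diversity of media data, cross-domain retrieval that jointly handles images and 3D models has become a new challenge in 3D model retrieval. To address the large discrepancy between images and 3D models, which makes them hard to match, a cross-domain retrieval method based on a triplet network is proposed. A joint feature embedding space for real images and 3D models is built in an end-to-end manner, and the similarity between data of different modalities is measured by the distance between their features, so that similar 3D models can be retrieved from a single image. To improve cross-domain retrieval accuracy, each 3D model is represented by a sequence of ordered views whose view-level features are aggregated by a gated recurrent unit (GRU), and an attention mechanism is introduced for image feature extraction to narrow the semantic gap between real images and projected views. Experimental results show that, compared with similar methods, the proposed method improves the retrieval mean average precision (mAP) on two cross-domain datasets by at least 2.98% to 3.05%.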
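The idea of a joint embedding space trained with triplets and queried by feature distance can be illustrated with a minimal sketch. It is not the authors' implementation: the hinge form of the loss, the margin value, the use of Euclidean distance, and the names `triplet_loss`, `retrieve`, and the gallery identifiers are all assumptions made for illustration, and the embeddings are hand-written vectors rather than network outputs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, margin is hypothetical).

    anchor:   embedding of a query image
    positive: embedding of a 3D model from the same class
    negative: embedding of a 3D model from a different class
    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def retrieve(query, gallery, top_k=3):
    """Rank gallery models by distance to the query embedding.

    gallery is a list of (model_id, embedding) pairs; the ids of the
    top_k closest models are returned, nearest first.
    """
    ranked = sorted(gallery, key=lambda item: euclidean(query, item[1]))
    return [model_id for model_id, _ in ranked[:top_k]]


if __name__ == "__main__":
    image = [0.0, 0.0]
    gallery = [
        ("chair_01", [0.0, 1.0]),   # hypothetical model ids
        ("table_03", [3.0, 4.0]),
        ("lamp_07", [0.0, 1.5]),
    ]
    print(triplet_loss(image, [0.0, 1.0], [3.0, 4.0]))
    print(retrieve(image, gallery, top_k=2))
```

In a trained system the vectors would come from the image branch and the 3D model branch of the network; only the distance-based ranking step is shown here.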

     

  • Figure 1.  Architecture of cross-domain retrieval triplet network

    Figure 2.  Detailed structure of attention module

    Figure 3.  Aggregation of view-level features using GRU networks

    Figure 4.  Examples of monocular image based 3D model retrieval results

    Table 1.  Test results of ablation experiment on IM2MN dataset

    Adaptive layer  Attention module  GRU  mAP/%
    42.16
    48.74
    51.93
    54.48
    55.65

    Table 2.  Test results of ablation experiment on MI3DOR dataset

    Adaptive layer  Attention module  GRU  mAP/%
    42.78
    49.67
    53.75
    55.24
    56.53

    Table 3.  Performance for image-based 3D model retrieval

    Dataset  Method               mAP/%
    IM2MN    MVCNN[29]            7.92
    IM2MN    Triplet+MVCNN[14]    40.85
    IM2MN    CDTNN[14]            52.67
    IM2MN    Proposed             55.65
    MI3DOR   CDTNN[14]            53.48
    MI3DOR   Proposed             56.53
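The tables report mean average precision (mAP). As a reminder of how this metric is commonly computed for ranked retrieval, here is a minimal sketch. The exact evaluation protocol of the paper is not given on this page, so the details are assumptions: each query is assumed to return a full ranking of the gallery, relevance is assumed to be binary (same class or not), and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance is a list of booleans, one per retrieved model in
    rank order, True when the model is relevant to the query. Precision
    is sampled at each relevant hit and averaged over the hits found.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_relevance):
    """Mean of the per-query average precision values."""
    if not all_ranked_relevance:
        return 0.0
    total = sum(average_precision(r) for r in all_ranked_relevance)
    return total / len(all_ranked_relevance)


if __name__ == "__main__":
    queries = [
        [True, False, True],   # relevant models at ranks 1 and 3
        [False, True],         # relevant model at rank 2
    ]
    print(f"mAP = {100 * mean_average_precision(queries):.2f}%")
```

Multiplying the result by 100 gives a percentage comparable in form to the mAP/% columns above.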
  • [1] BU S H, WANG L, HAN P C, et al.3D shape recognition and retrieval based on multi-modality deep learning[J].Neurocomputing, 2017, 259:183-193. doi: 10.1016/j.neucom.2016.06.088
    [2] CAI Y H, WANG X Y, HU S B, et al.Three-dimensional human pose estimation based on multi-source image weakly-supervised learning[J].Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(12):2375-2384(in Chinese). doi: 10.13700/j.bh.1001-5965.2019.0387
    [3] GIRDHAR R, FOUHEY D F, RODRIGUEZ M, et al.Learning a predictable and generative vector representation for objects[C]//European Conference on Computer Vision.Berlin: Springer, 2016: 484-499.
    [4] TULSIANI S, GUPTA S, FOUHEY D F, et al.Factoring shape, pose, and layout from the 2d image of a 3d scene[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2018: 302-310.
    [5] IYER N, JAYANTI S, LOU K, et al.Three-dimensional shape searching:State-of-the-art review and future trends[J].Computer-Aided Design, 2005, 37(5):509-530. http://www.sciencedirect.com/science/article/pii/S001044850400140X
    [6] XIE J, FANG Y, ZHU F, et al.Deepshape: Deep learned shape descriptor for 3d shape matching and retrieval[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2015: 1275-1283.
    [7] MIAN A S, BENNAMOUN M, OWENS R A.Matching tensors for pose invariant automatic 3D face recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2005: 120.
    [8] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.Imagenet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems.Cambridge: MIT Press, 2012: 1097-1105.
    [9] YANG S C, WANG H F, WANG Y H, et al.Super-resolution reconstructing algorithm based on deep learning mechanism and wavelet fusion[J].Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(1):189-197(in Chinese). doi: 10.13700/j.bh.1001-5965.2019.0146
    [10] GRABNER A, ROTH P M, LEPETIT V.Location field descriptors: Single image 3D model retrieval in the wild[C]//Proceedings of the 2019 International Conference on 3D Vision (3DV).Piscataway: IEEE Press, 2019: 583-593.
    [11] WU Z Z, ZHANG Y H, ZENG M, et al.Joint analysis of shapes and images via deep domain adaptation[J].Computers & Graphics, 2018, 70:140-147. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=464c4602d6b160c53d6eb7368840f7a7
    [12] WOO S, PARK J, LEE J Y, et al.CBAM: Convolutional block attention module[C]//European Conference on Computer Vision.Berlin: Springer, 2018: 3-19.
    [13] CHO K, VAN MERRIËNBOER B, GULCEHRE C, et al.Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL].(2014-06-03)[2020-02-25].https://arxiv.org/abs/1406.1078.
    [14] LEE T, LIN Y L, CHIANG H Y, et al.Cross-domain image-based 3D shape retrieval by view sequence learning[C]//Proceedings of the 2018 International Conference on 3D Vision (3DV).Piscataway: IEEE Press, 2018: 258-266.
    [15] LI W, LIU A, NIE W Z, et al.SHREC 2019-Monocular image based 3D model retrieval[EB/OL].(2019-01-28)[2020-02-25].https://www.iti-tju.org/MI3DOR19/.
    [16] FANG Y, XIE J, DAI G, et al.3D deep shape descriptor[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2015: 2319-2328.
    [17] LI H S, WU Y J, ZHENG Y P, et al.A survey of 3D data analysis and understanding based on deep learning[J/OL].Chinese Journal of Computers, 2019, 42: 1-25.(2019-07-09)[2020-02-21].http://kns.cnki.net/kcms/detail/11.1826.TP.20190709.1509.002.html(in Chinese).
    [18] OSADA R, FUNKHOUSER T, CHAZELLE B, et al.Shape distributions[J].ACM Transactions on Graphics (TOG), 2002, 21(4):807-832. doi: 10.1145/571647.571648
    [19] MAHMOUDI M, SAPIRO G.Three-dimensional point cloud recognition via distributions of geometric distances[J].Graphical Models, 2009, 71(1):22-31. http://www.sciencedirect.com/science/article/pii/S1524070308000313
    [20] SUN J, OVSJANIKOV M, GUIBAS L.A concise and provably informative multi-scale signature based on heat diffusion[J].Computer Graphics Forum, 2009, 28(5):1383-1392. doi: 10.1111/j.1467-8659.2009.01515.x
    [21] AUBRY M, SCHLICKEWEI U, CREMERS D.The wave kernel signature: A quantum mechanical approach to shape analysis[C]//Proceedings of 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).Piscataway: IEEE Press, 2011: 1626-1633.
    [22] WANG P S, SUN C Y, LIU Y, et al.Adaptive O-CNN:A patch-based deep representation of 3D shapes[J].ACM Transactions on Graphics (TOG), 2018, 37(6):1-11. http://arxiv.org/abs/1809.07917
    [23] FENG Y, FENG Y, YOU H, et al.MeshNet: Mesh neural network for 3D shape representation[C]//Proceedings of the AAAI Conference on Artificial Intelligence.Palo Alto: AAAI Press, 2019, 33: 8279-8286.
    [24] QI C R, YI L, SU H, et al.Pointnet++: Deep hierarchical feature learning on point sets in a metric space[C]//Advances in Neural Information Processing Systems.Cambridge: MIT Press, 2017: 5099-5108.
    [25] HUANG H B, KALOGERAKIS E, CHAUDHURI S, et al.Learning local shape descriptors from part correspondences with multiview convolutional networks[J].ACM Transactions on Graphics (TOG), 2017, 37(1):1-14. doi: 10.1145/3137609
    [26] WANG P S, LIU Y, GUO Y X, et al.O-CNN:Octree-based convolutional neural networks for 3d shape analysis[J].ACM Transactions on Graphics (TOG), 2017, 36(4):1-11. http://dl.acm.org/citation.cfm?id=3073608
    [27] HAN Z, SHANG M, LIU Z, et al.SeqViews2SeqLabels:Learning 3D global features via aggregating sequential views by RNN with attention[J].IEEE Transactions on Image Processing, 2018, 28(2):658-672. http://ieeexplore.ieee.org/document/8453813/
    [28] LAN S Y, YU R C, YU G, et al.Modeling local geometric structure of 3D point clouds using Geo-CNN[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2019: 998-1008.
    [29] SU H, MAJI S, KALOGERAKIS E, et al.Multi-view convolutional neural networks for 3d shape recognition[C]//Proceedings of the IEEE International Conference on Computer Vision.Piscataway: IEEE Press, 2015: 945-953.
    [30] FENG Y, ZHANG Z, ZHAO X, et al.GVCNN: Group-view convolutional neural networks for 3D shape recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2018: 264-272.
    [31] JIANG J, BAO D, CHEN Z, et al.MLVCNN: Multi-loop-view convolutional neural network for 3D shape retrieval[C]//Proceedings of the AAAI Conference on Artificial Intelligence.Palo Alto: AAAI Press, 2019, 33: 8513-8520.
    [32] TASSE F P, DODGSON N.Shape2Vec:Semantic-based descriptors for 3D shapes, sketches and images[J].ACM Transactions on Graphics (TOG), 2016, 35(6):1-12. http://www.zhangqiaokeyan.com/academic-journal-foreign_other_thesis/0204110052953.html
    [33] AUBRY M, MATURANA D, EFROS A A, et al.Seeing 3d chairs: Exemplar part-based 2d-3d alignment using a large dataset of cad models[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2014: 3762-3769.
    [34] MOTTAGHI R, XIANG Y, SAVARESE S.A coarse-to-fine model for 3d pose estimation and sub-category recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2015: 418-426.
    [35] KULIS B.Metric learning:A survey[J].Foundations and Trends in Machine Learning, 2013, 5(4):287-364. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?bkn=8186753
    [36] XIANG Y, KIM W, CHEN W, et al.ObjectNet3D: A large scale database for 3d object recognition[C]//European Conference on Computer Vision.Berlin: Springer, 2016: 160-176.
    [37] WANG F, KANG L, LI Y.Sketch-based 3d shape retrieval using convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2015: 1875-1883.
    [38] LI Y Y, SU H, QI C R, et al.Joint embeddings of shapes and images via CNN image purification[J].ACM Transactions on Graphics (TOG), 2015, 34(6):1-12. http://dl.acm.org/citation.cfm?id=2818071
    [39] DAI G, XIE J, ZHU F, et al.Deep correlated metric learning for sketch-based 3d shape retrieval[C]//Proceedings of the AAAI Conference on Artificial Intelligence.Palo Alto: AAAI Press, 2017: 4002-4008.
    [40] DAI G X, XIE J, FANG Y.Deep correlated holistic metric learning for sketch-based 3d shape retrieval[J].IEEE Transactions on Image Processing, 2018, 27(7):3374-3386. doi: 10.1109/TIP.2018.2817042
    [41] SCHROFF F, KALENICHENKO D, PHILBIN J.FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Piscataway: IEEE Press, 2015: 815-823.
    [42] SIMONYAN K, ZISSERMAN A.Very deep convolutional networks for large-scale image recognition[EB/OL].(2014-09-04)[2020-02-25].https://arxiv.org/abs/1409.1556.
Publication history
  • Received:  2020-02-28
  • Accepted:  2020-03-28
  • Published:  2020-09-20
