
A visual detection and grasping method based on deep learning

SUN Xiantao, CHENG Wei, CHEN Wenjie, FANG Xiaohan, CHEN Weihai, YANG Yinming

Citation: SUN X T, CHENG W, CHEN W J, et al. A visual detection and grasping method based on deep learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2023, 49(10): 2635-2644 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0130

doi: 10.13700/j.bh.1001-5965.2022.0130
Funds: National Natural Science Foundation of China (52005001)
Corresponding author E-mail: whchen@buaa.edu.cn
CLC number: TP242

  • Abstract:

    To address the problems that existing robotic grasping systems demand high-end hardware, adapt poorly to different objects, and generate large harmful torques during grasping, a visual detection and grasping method based on deep learning is proposed. A channel attention mechanism is introduced into YOLO-V3 to strengthen the network's image feature extraction and improve target detection in complex environments, raising the average recognition rate by 0.32% over the unmodified network. To overcome the discreteness of current pose-estimation angles, a minimum-area bounding rectangle (MABR) algorithm embedded in a Visual Geometry Group-16 (VGG-16) backbone is proposed for grasp pose estimation and angle optimization. With the improvement, the average error between the predicted grasp angle and the target's actual angle is below 2.47°, greatly reducing the extra harmful torque that a two-finger gripper applies to an object during grasping. A visual grasping system was built from a UR5 manipulator, a pneumatic two-finger gripper, a Realsense D435 camera, and an ATI-Mini45 six-axis force/torque sensor. Experiments show that the proposed method grasps and sorts different objects effectively, imposes low hardware requirements, and cuts harmful torque by about 75%, thereby reducing damage to objects; it therefore has good application prospects.
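    The abstract does not reproduce the network details, but a channel attention module of the squeeze-and-excitation style is the usual realization of the mechanism it names (see Figure 3 below). The following PyTorch sketch is illustrative only, not the authors' implementation; the reduction ratio of 16 and the insertion point inside YOLO-V3 are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention block (illustrative)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average pooling
        self.fc = nn.Sequential(             # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights  # reweight each feature channel

# Example: reweight a YOLO-V3-sized feature map (batch 1, 256 channels, 52x52)
features = torch.randn(1, 256, 52, 52)
refined = ChannelAttention(256)(features)  # same shape, channel-reweighted
```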

     

  • Figure 1. Grasping system diagram
    Figure 2. Structure of YOLO-V3 algorithm
    Figure 3. Channel attention mechanism module
    Figure 4. Five-dimensional grasp rectangle diagram
    Figure 5. Structure of grasping pose estimation network
    Figure 6. Angle recognition schematic diagram
    Figure 7. Target detection category diagram
    Figure 8. Diagram of target detection results
    Figure 9. Grasping pose estimation results
    Figure 10. System prototype setup
    Figure 11. Grasping experiments
    Figure 12. Grasping torque diagram

    Table 1. Comparison of different network structures

    Network structure      | Accuracy/% (Cornell dataset) | Accuracy/% (experimental targets) | Running time/s
    Two-layer ResNet-50    | 91.30                        | 87.11                             | 0.932
    Single-layer ResNet-50 | 91.12                        | 86.69                             | 0.714
    Single-layer VGG-16    | 90.89                        | 87.19                             | 0.286

    Table 2. Pose estimation results

    (The estimated grasp point is identical before and after the improvement; only the grasp angle changes.)

    Target         | Grasp point (u, v)/pixel | Grasp angle/(°) (before → after) | Actual angle/(°)
    control board  | (107, 112.2)             | 100 → 124                        | 123
    hammer         | (92.3, 109.3)            | 30 → 18                          | 23
    shovel         | (111.3, 108.5)           | 50 → 62                          | 59
    wrench         | (87.5, 132.5)            | 40 → 46                          | 44
    scissors       | (104.5, 118.3)           | 50 → 53                          | 54
    pliers         | (88.1, 114.5)            | 40 → 48                          | 52
    umbrella       | (88.5, 98.4)             | 30 → 35                          | 35
    weight counter | (100.5, 136.2)           | 90 → 135                         | 127
    stapler        | (106.9, 104.7)           | 30 → 46                          | 48
    solid glue     | (98.2, 120.9)            | 40 → 45                          | 45
    screwdriver    | (83.6, 110.2)            | 130 → 161                        | 162
    sponge         | (84.6, 118.7)            | 180 → 13                         | 14
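    For the angle optimization that Table 2 evaluates, the abstract describes fitting a minimum-area bounding rectangle (MABR) to obtain a continuous grasp angle in place of a discrete one. Below is a minimal sketch of the geometric step, assuming OpenCV and a binary object mask as input; the helper name, the mask source, and the angle normalization are assumptions, and OpenCV's raw angle convention varies across versions.

```python
import cv2
import numpy as np

def grasp_angle_from_mask(mask: np.ndarray) -> float:
    """Fit a minimum-area bounding rectangle (MABR) to a binary object mask
    and return a continuous in-plane angle in degrees (hypothetical helper)."""
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    largest = max(contours, key=cv2.contourArea)  # assume one dominant object
    (cx, cy), (w, h), angle = cv2.minAreaRect(largest)
    # OpenCV's reported angle depends on the rectangle's edge ordering; align
    # it with the long side of the rectangle and fold into [0, 180).
    if w < h:
        angle += 90.0
    return angle % 180.0
```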

    Table 3. Experimental data of grasping

    (The grasp point is identical before and after the improvement; the grasp angle and torque change.)

    No.           | Grasp point (x, y, z)/mm  | Grasp angle/(°) (before → after) | Actual angle/(°) | Grasp torque/(N·mm) (before → after)
    Experiment 1  | (153.41, −675.29, 102.35) | 50 → 52                          | 53               | 4.0 → 2.3
    Experiment 2  | (131.22, −603.70, 99.16)  | 50 → 58                          | 58               | 9.5 → 0.3
    Experiment 3  | (161.96, −558.44, 102.71) | 140 → 157                        | 156              | 15.0 → 2.6
    Experiment 4  | (111.15, −574.50, 98.79)  | 10 → 21                          | 19               | 8.0 → 5.0
    Experiment 5  | (114.63, −732.19, 96.96)  | 30 → 39                          | 47               | 19.0 → 11.0
    Experiment 6  | (102.68, −657.68, 100.41) | 40 → 51                          | 50               | 10.6 → 2.5
    Experiment 7  | (127.41, −675.63, 100.53) | 40 → 46                          | 45               | 4.0 → 1.5
    Experiment 8  | (155.39, −597.67, 105.50) | 50 → 53                          | 55               | 8.0 → 3.8
    Experiment 9  | (176.65, −690.90, 103.57) | 90 → 111                         | 113              | 17.5 → 4.0
    Experiment 10 | (131.77, −739.27, 100.34) | 100 → 112                        | 112              | 12.5 → 0
    Experiment 11 | (194.20, −687.68, 101.49) | 30 → 36                          | 35               | 5.0 → 2.5
    Experiment 12 | (127.47, −590.49, 100.38) | 30 → 63                          | 63               | 25.0 → 0
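    As a quick consistency check on the abstract's claim of roughly 75% torque reduction: averaging the torque column of Table 3 gives 138.1/12 ≈ 11.5 N·mm before the improvement and 35.5/12 ≈ 3.0 N·mm after it, a reduction of about 74%.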
Publication history
  • Received: 2022-03-08
  • Accepted: 2022-05-06
  • Available online: 2022-05-26
  • Issue date: 2023-10-31
