留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于深度强化学习的离散状态转移算法求解柔性作业车间调度问题

朱家政 王聪 李新凯 董颖超 张宏立

朱家政,王聪,李新凯,等. 基于深度强化学习的离散状态转移算法求解柔性作业车间调度问题[J]. 北京航空航天大学学报,2025,51(4):1385-1394 doi: 10.13700/j.bh.1001-5965.2023.0211
引用本文: 朱家政,王聪,李新凯,等. 基于深度强化学习的离散状态转移算法求解柔性作业车间调度问题[J]. 北京航空航天大学学报,2025,51(4):1385-1394 doi: 10.13700/j.bh.1001-5965.2023.0211
ZHU J Z,WANG C,LI X K,et al. A deep reinforcement learning based on discrete state transition algorithm for solving fuzzy flexible job shop scheduling problem[J]. Journal of Beijing University of Aeronautics and Astronautics,2025,51(4):1385-1394 (in Chinese) doi: 10.13700/j.bh.1001-5965.2023.0211
Citation: ZHU J Z,WANG C,LI X K,et al. A deep reinforcement learning based on discrete state transition algorithm for solving fuzzy flexible job shop scheduling problem[J]. Journal of Beijing University of Aeronautics and Astronautics,2025,51(4):1385-1394 (in Chinese) doi: 10.13700/j.bh.1001-5965.2023.0211

基于深度强化学习的离散状态转移算法求解柔性作业车间调度问题

doi: 10.13700/j.bh.1001-5965.2023.0211
基金项目: 国家自然科学基金(52267010,62263030);新疆维吾尔自治区自然科学基金资助项目(2022D01C367,2022D01E33)
详细信息
    通讯作者:

    E-mail:zhlxju@163.com

  • 中图分类号: TP181

A deep reinforcement learning based on discrete state transition algorithm for solving fuzzy flexible job shop scheduling problem

Funds: National Natural Science Foundation of China (52267010,62263030); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C367,2022D01E33)
More Information
  • 摘要:

    柔性作业车间调度问题(FJSP)作为一种在实际生活中应用广泛的调度问题,对其智能算法具有重要价值。为了解决FJSP,以最小化最大完工时间为优化目标,提出了一种基于近端策略优化的离散状态转移算法(DSTA-PPO)。DSTA-PPO具有3个特点:考虑到FJSP需要同时对工序排序、机器分配同时进行调度安排,结合工序编码和机器编码,设计了一种能够充分表达当前调度问题的状态特征;针对工序排序、机器分配设计了多种基于关键路径的搜索操作;通过强化学习的训练,能够有效地引导智能体选择正确的搜索操作优化当前的调度序列。通过基于不同数据集的仿真实验,验证了算法各环节的有效性,同时在相同算例上以最小化最大完工时间为对比指标与现有算法进行了比较,对比结果表明了所提算法能够在多数算例上以更短的完工时间对算例完成求解,有效地求解了柔性作业车间调度问题。

     

  • 图 1  FJSP的析取图描述

    Figure 1.  Description of FJSP's disjunction graph

    图 2  编码示意图

    Figure 2.  Encoding diagram

    图 3  邻域结构示意图

    Figure 3.  Schematic diagram of neighborhood structure

    图 4  状态特征列表

    Figure 4.  Status feature list

    图 5  因素的趋势变化

    Figure 5.  Trend changes in factors

    图 6  箱线图

    Figure 6.  Box plot

    图 7  MK10算例调度图

    Figure 7.  MK10 example scheduling diagram

    图 8  Vdata_la40算例调度图

    Figure 8.  Vdata_la40 Arithmetic scheduling diagram

    表  1  3×3的FJSP实例

    Table  1.   3×3 FJSP instance

    工件 工序 工序号 工序和加工时间
    M1 M2 M3
    J1 O11 1 1 2 *
    O12 2 3 2 1
    J2 O21 3 * 2 1
    O22 4 1 2 4
    O23 5 * * 4
    J3 O31 6 3 2 2
    O32 7 1 * 2
     注:“*”表示该工序无法在该机器上进行加工。
    下载: 导出CSV

    表  2  正交试验表及各参数组合的响应值

    Table  2.   Orthogonal test table and response values of various parameter combinations

    编号 参数因子组合 RV
    SE α batch mini-batch
    1 1 1 1 1 3.054784
    2 1 2 2 2 3.181015
    3 1 3 3 3 3.08003
    4 1 4 4 4 2.928553
    5 2 1 2 3 2.802323
    6 2 2 1 4 2.575107
    7 2 3 4 1 2.600353
    8 2 4 3 2 2.827569
    9 3 1 3 4 2.2974
    10 3 2 4 3 2.398384
    11 3 3 1 2 2.019692
    12 3 4 2 1 2.09543
    13 4 1 4 2 2.322646
    14 4 2 3 1 2.171169
    15 4 3 2 4 2.145923
    16 4 4 1 3 2.196415
    下载: 导出CSV

    表  3  参数各水平的极差分析

    Table  3.   Extreme variance analysis of parameters at different levels

    参数因子 K Kavg 最佳水平 R 水平数量 每水平重复数
    1 2 3 4 1 2 3 4
    SE 12.24 10.81 8.81 8.84 3.06 2.7 2.2 2.21 3 −0.86 4 4
    α 10.48 10.33 9.85 10.05 2.62 2.58 2.46 2.51 3 −0.16 4 4
    batch 9.85 10.22 10.38 10.25 2.46 2.56 2.59 2.56 1 −0.13 4 4
    mini-batch 9.92 10.35 10.48 9.95 2.48 2.59 2.62 2.49 1 −0.14 4 4
    下载: 导出CSV

    表  4  第一类测试集对比

    Table  4.   Comparison of the first type of dataset

    数据集 最优解 DSTA SLGA 集成强化学习 DSTA-PPO
    Mk01 36 40* 40* 42 40*
    Mk02 24 28 27 31 26*
    Mk03 204 204 204* 204* 204*
    Mk04 48 66 60* 69 60*
    Mk05 168 178 172* 180 172*
    Mk06 33 73 69 68 62
    Mk07 133 147 144 153 142
    Mk08 523 523* 523* 523* 523*
    Mk09 299 327 320 315 307*
    Mk10 165 263 254 230 225
    下载: 导出CSV

    表  5  第二类测试集对比

    Table  5.   Comparison of the second type of dataset

    Dataset Val gap/%
    DSTA Multi-PPO DSTA-PPO UB DSTA Multi-PPO DSTA-PPO
    10×5 Vdata_la1 613 610 573 570 7.54 7.02 0.53
    Vdata_la2 557 555 536 529 5.29 4.91 1.32
    Vdata_la3 531 532 487 477 11.32 11.53 2.10
    Vdata_la4 538 530 510 502 7.17 5.58 1.59
    Vdata_la5 490 507 467 457 7.22 10.94 2.19
    15×5 Vdata_la6 824 820 802 799 3.13 2.63 0.38
    Vdata_la7 788 757 755 749 5.21 1.07 0.80
    Vdata_la8 791 782 765 765 3.40 2.22 0.00
    Vdata_la9 925 879 859 853 8.44 3.05 0.70
    Vdata_la10 830 862 808 804 3.23 7.21 0.50
    20×5 Vdata_la11 1102 1101 1074 1071 2.89 2.80 0.28
    Vdata_la12 960 950 939 936 2.56 1.50 0.32
    Vdata_la13 1070 1053 1042 1038 3.08 1.45 0.39
    Vdata_la14 1095 1086 1071 1070 2.34 1.56 0.09
    Vdata_la15 1117 1111 1092 1089 2.57 2.02 0.28
    10×10 Vdata_la16 717 717 717 717 0.00 0.00 0.00
    Vdata_la17 646 647 646 646 0.00 0.15 0.00
    Vdata_la18 714 663 663 663 7.69 0.00 0.00
    Vdata_la19 705 626 644 617 14.26 1.46 4.38
    Vdata_la20 756 756 756 756 0.00 0.00 0.00
    15×10 Vdata_la21 903 887 870 804 12.31 10.32 8.21
    Vdata_la22 849 793 798 736 15.35 7.74 8.42
    Vdata_la23 970 858 888 815 19.02 5.28 8.96
    Vdata_la24 891 883 834 775 14.97 13.94 7.61
    Vdata_la25 896 883 820 756 18.52 16.80 8.47
    20×10 Vdata_la26 1195 1089 1100 1054 13.38 3.32 4.36
    Vdata_la27 1231 1123 1133 1084 13.56 3.60 4.52
    Vdata_la28 1249 1106 1121 1070 16.73 3.36 4.77
    Vdata_la29 1121 1049 1048 994 12.78 5.53 5.43
    Vdata_la30 1256 1117 1122 1069 17.49 4.49 4.96
    30×10 Vdata_la31 1688 1561 1553 1520 11.05 2.70 2.17
    Vdata_la32 1837 1693 1697 1658 10.80 2.11 2.35
    Vdata_la33 1630 1531 1530 1497 8.88 2.27 2.20
    Vdata_la34 1731 1562 1565 1537 12.62 1.63 1.82
    Vdata_la35 1711 1574 1582 1549 10.46 1.61 2.13
    15×15 Vdata_la36 1100 985 967 948 16.03 3.90 2.00
    Vdata_la37 1171 1028 1035 986 18.76 4.26 4.97
    Vdata_la38 1061 948 943 943 12.51 0.53 0.00
    Vdata_la39 1109 976 959 922 20.28 6.18 4.01
    Vdata_la40 1041 968 955 955 9.01 1.36 0.00
     注:Ave.Gap: DSTA、Multi-ppo、DSTA-PPO分别为9.55、4.20、2.58。
    下载: 导出CSV
  • [1] THAMES L, SCHAEFER D. Software-defined cloud manufacturing for industry 4.0[J]. Procedia CIRP, 2016, 52: 12-17. doi: 10.1016/j.procir.2016.07.041
    [2] GOLPÎRA H, KHAN S A R, SAFAEIPOUR S. A review of logistics Internet-of-things: current trends and scope for future research[J]. Journal of Industrial Information Integration, 2021, 22: 100194. doi: 10.1016/j.jii.2020.100194
    [3] LIU Y K, WANG L H, WANG X V, et al. Scheduling in cloud manufacturing: state-of-the-art and research challenges[J]. International Journal of Production Research, 2019, 57(15-16): 4854-4879. doi: 10.1080/00207543.2018.1449978
    [4] FANG Y L, PENG C, LOU P, et al. Digital-twin-based job shop scheduling toward smart manufacturing[J]. IEEE Transactions on Industrial Informatics, 2019, 15(12): 6425-6435. doi: 10.1109/TII.2019.2938572
    [5] ZHANG Y F, WANG J, LIU S C, et al. Game theory based real‐time shop floor scheduling strategy and method for cloud manufacturing[J]. International Journal of Intelligent Systems, 2017, 32(4): 437-463. doi: 10.1002/int.21868
    [6] GAO K Z, CAO Z G, ZHANG L, et al. A review on swarm intelligence and evolutionary algorithms for solving flexible job shop scheduling problems[J]. IEEE/CAA Journal of Automatica Sinica, 2019, 6(4): 904-916. doi: 10.1109/JAS.2019.1911540
    [7] CHOI I C, CHOI D S. A local search algorithm for jobshop scheduling problems with alternative operations and sequence-dependent setups[J]. Computers & Industrial Engineering, 2002, 42(1): 43-58.
    [8] ZHANG G H, SHAO X Y, LI P G, et al. An effective hybrid particle swarm optimization algorithm for multi-objective flexible job-shop scheduling problem[J]. Computers & Industrial Engineering, 2009, 56(4): 1309-1318.
    [9] NOUIRI M, BEKRAR A, JEMAI A, et al. Two stage particle swarm optimization to solve the flexible job shop predictive scheduling problem considering possible machine breakdowns[J]. Computers & Industrial Engineering, 2017, 112: 595-606.
    [10] NOUIRI M, BEKRAR A, JEMAI A, et al. An effective and distributed particle swarm optimization algorithm for flexible job-shop scheduling problem[J]. Journal of Intelligent Manufacturing, 2018, 29(3): 603-615. doi: 10.1007/s10845-015-1039-3
    [11] XING L N, CHEN Y W, WANG P, et al. A knowledge-based ant colony optimization for flexible job shop scheduling problems[J]. Applied Soft Computing, 2010, 10(3): 888-896. doi: 10.1016/j.asoc.2009.10.006
    [12] JIANG T H, ZHANG C. Application of grey wolf optimization for solving combinatorial problems: job shop and flexible job shop scheduling cases[J]. IEEE Access, 2018, 6: 26231-26240. doi: 10.1109/ACCESS.2018.2833552
    [13] ZHOU X J, YANG C H, GUI W H. State transition algorithm[J]. Journal of Industrial & Management Optimization, 2012, 8(4): 1039-1056.
    [14] 董天雪, 阳春华, 周晓君, 等. 一种求解企业员工指派问题的离散状态转移算法[J]. 控制理论与应用, 2016, 33(10): 1378-1388. doi: 10.7641/CTA.2016.50982

    DONG T X, YANG C H, ZHOU X J, et al. A novel discrete state transition algorithm for staff assignment problem[J]. Control Theory & Applications, 2016, 33(10): 1378-1388(in Chinese). doi: 10.7641/CTA.2016.50982
    [15] 张凤雪, 阳春华, 周晓君, 等. 基于控制周期计算的锌液净化除铜过程优化控制[J]. 控制理论与应用, 2017, 34(10): 1388-1395. doi: 10.7641/CTA.2017.60621

    ZHANG F X, YANG C H, ZHOU X J, et al. Optimal control based on control period calculation for copper removal process of zinc solution purification[J]. Control Theory & Applications, 2017, 34(10): 1388-1395(in Chinese). doi: 10.7641/CTA.2017.60621
    [16] DU Y, LI J Q, CHEN X L, et al. Knowledge-based reinforcement learning and estimation of distribution algorithm for flexible job shop scheduling problem[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2022, 7(4): 1036-1050.
    [17] CHEN R H, YANG B, LI S, et al. A self-learning genetic algorithm based on reinforcement learning for flexible job-shop scheduling problem[J]. Computers & Industrial Engineering, 2020, 149: 106778.
    [18] LEI K, GUO P, ZHAO W C, et al. A multi-action deep reinforcement learning framework for flexible Job-shop scheduling problem[J]. Expert Systems with Applications, 2022, 205: 117796. doi: 10.1016/j.eswa.2022.117796
    [19] 张凯, 毕利, 焦小刚. 集成强化学习算法的柔性作业车间调度问题研究[J]. 中国机械工程, 2023, 34(2): 201-207. doi: 10.3969/j.issn.1004-132X.2023.02.010

    ZHANG K, BI L, JIAO X G. Research on flexible job-shop scheduling problems with integrated reinforcement learning algorithm[J]. China Mechanical Engineering, 2023, 34(2): 201-207(in Chinese). doi: 10.3969/j.issn.1004-132X.2023.02.010
    [20] 王凌, 王晶晶. 考虑运输时间的分布式绿色柔性作业车间调度协同群智能优化[J]. 中国科学: 技术科学: 1-15.
  • 加载中
图(8) / 表(5)
计量
  • 文章访问数:  254
  • HTML全文浏览量:  73
  • PDF下载量:  6
  • 被引次数: 0
出版历程
  • 收稿日期:  2023-04-26
  • 录用日期:  2023-07-27
  • 网络出版日期:  2023-09-12
  • 整期出版日期:  2025-04-30

目录

    /

    返回文章
    返回
    常见问答