Abstract
Time delays are present in virtually all control systems. Ignoring them can degrade controller performance and may even cause the closed-loop response to diverge. This study investigates how time delay affects the performance of a reinforcement learning (RL) vibration controller. First, a dynamic model of a piezoelectric cantilever beam is established using the finite element method, and its parameters are corrected through experimental identification. Next, the effects of different time delays on a PD controller and on an RL controller based on Proximal Policy Optimization (PPO) are analyzed in simulation. Then, several RL time-delay controllers are trained, each under a different time-delay condition, and their control performance is verified in simulation and experiment. Finally, the robustness of the RL time-delay controllers to time-delay deviations is evaluated. The results show that each RL time-delay controller performs well under the delay it was trained for and also tolerates a certain range of deviation in the actual delay, indicating good robustness.
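The claim that an ignored delay can degrade or destabilize a controller can be illustrated with a minimal sketch. It assumes a single vibration mode as a stand-in for the finite element beam model, and every value in it (natural frequency, damping ratio, gains, delays) is a hypothetical choice for illustration, not a parameter from the study. The function `simulate_delayed_pd` is likewise an illustrative name.

```python
import math
from collections import deque


def simulate_delayed_pd(delay, kp=100.0, kd=5.0, freq_hz=10.0, zeta=0.005,
                        dt=1e-4, t_end=2.0, x0=1e-3):
    """Free vibration of one mode under PD feedback delayed by `delay` seconds.

    Returns the peak |displacement| during the last vibration period.
    All parameter values are illustrative assumptions, not from the paper.
    """
    omega = 2.0 * math.pi * freq_hz
    n_delay = round(delay / dt)
    # Measurement buffer: zeros stand for "no measurement yet" before t = 0.
    history = deque([(0.0, 0.0)] * n_delay, maxlen=n_delay + 1)

    x, v = x0, 0.0
    xs = []
    for _ in range(round(t_end / dt)):
        history.append((x, v))
        x_meas, v_meas = history[0]          # state from n_delay steps ago
        u = -kp * x_meas - kd * v_meas       # PD law on the delayed measurement
        a = u - 2.0 * zeta * omega * v - omega ** 2 * x
        v += a * dt                          # semi-implicit Euler step
        x += v * dt
        xs.append(x)

    n_period = round(1.0 / (freq_hz * dt))
    return max(abs(xi) for xi in xs[-n_period:])


if __name__ == "__main__":
    for tau in (0.0, 0.01, 0.05):            # 0.05 s is half a period at 10 Hz
        peak = simulate_delayed_pd(tau)
        print(f"delay = {tau:.2f} s, final-period peak = {peak:.3e}")
```

With these assumed values, the undelayed loop decays, whereas a delay of half a vibration period makes the velocity feedback act as negative damping, so the response grows instead of decaying.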
Key words
reinforcement learning /
proximal policy optimization (PPO) /
time delay /
vibration control