Path Planning of the Lunar Surface Sampling Manipulator for the Chang'E-5 Mission

Abstract: To address the precise control of the sampling manipulator in the Chang'E-5 lunar surface sampling mission, a path planning method based on deep reinforcement learning is proposed. A multi-constraint reward function is designed for the deep reinforcement learning algorithm, so that the planned motion path satisfies three constraints (safety, rapidity, and reachability) and the sampling manipulator is controlled precisely. On the premise that mission safety is satisfied, the space-ground interaction time is shortened and the manipulator motion remains smooth. In-orbit experimental results show that the method has high accuracy and robustness, and it can serve as a reference for in-orbit teleoperated sampling tasks in subsequent deep space exploration missions.
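
As an illustration of what a multi-constraint reward combining safety, rapidity, and reachability terms might look like, a minimal Python sketch is given below. It is not the reward function used in the mission: the function name `multi_constraint_reward`, the weights, the thresholds `safe_dist` and `reach_tol`, and the form of each term are all assumptions chosen for illustration only.

```python
import math


def multi_constraint_reward(tip, goal, obstacle_dist,
                            safe_dist=0.05, reach_tol=0.01, step_cost=0.01,
                            w_safe=1.0, w_fast=1.0, w_reach=1.0):
    """Return (reward, done) for one planning step.

    All parameters are hypothetical:
      tip, goal      -- 3-D positions of the end effector and the target (m)
      obstacle_dist  -- current minimum clearance to any obstacle (m)
    """
    dist = math.dist(tip, goal)

    # Safety term (assumed form): fixed penalty when clearance is too small.
    unsafe = obstacle_dist < safe_dist
    r_safe = -1.0 if unsafe else 0.0

    # Rapidity term (assumed form): constant cost per step, so shorter
    # paths accumulate less penalty.
    r_fast = -step_cost

    # Reachability term (assumed form): negative distance to the target,
    # plus a bonus once the target is within tolerance.
    reached = dist <= reach_tol
    r_reach = -dist + (1.0 if reached else 0.0)

    reward = w_safe * r_safe + w_fast * r_fast + w_reach * r_reach
    done = reached or unsafe  # episode ends on success or safety violation
    return reward, done


if __name__ == "__main__":
    print(multi_constraint_reward((0.3, 0.0, 0.2), (0.5, 0.1, 0.0), 0.20))
```

In a sketch of this kind, the relative weights determine how the planner trades path length against clearance; suitable values would have to be tuned against the actual manipulator model and workspace.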