
Multi-Agent Reinforcement Learning Autonomous Task Planning for Deep Space Probes

  • Abstract: To meet the requirements for autonomy, speed, and adaptability in the cooperative planning of subsystems while a deep space probe performs an attachment mission, a multi-agent reinforcement learning cooperative planning method based on proximal policy optimization is proposed. The single-agent proximal policy optimization algorithm is combined with a hybrid multi-agent cooperation mechanism to design a multi-agent autonomous task planning model, and noise-regularized advantage values are introduced to address overfitting of the cooperative policy during centralized multi-agent training. Simulation results show that the method adapts to real-time environmental changes, intelligently and autonomously optimizing and adjusting the cooperation strategy for small celestial body attachment missions in a timely manner. Compared with the algorithm before improvement, it raises the task planning success rate and the quality of the planning solutions, and shortens the task planning time.
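The abstract names two ingredients: the clipped proximal policy optimization objective and noise-regularized advantage values. The following is a minimal sketch of how they could fit together, not the authors' implementation. It assumes that a centralized critic produces one shared advantage estimate per sample and that each agent receives its own additive zero-mean Gaussian perturbation of it; the function names, the additive noise form, and the `noise_std` and `clip_eps` defaults are all hypothetical, since the abstract does not specify them.

```python
import random


def noisy_advantages(shared_adv, n_agents, noise_std=0.1, seed=None):
    """Return one perturbed copy of the shared advantages per agent.

    Assumption: additive zero-mean Gaussian noise, drawn independently
    for every agent and every sample.
    """
    rng = random.Random(seed)
    return [
        [a + rng.gauss(0.0, noise_std) for a in shared_adv]
        for _ in range(n_agents)
    ]


def ppo_clip_loss(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (negated, to be minimized).

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i).
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(r * a, clipped_r * a)
    return -total / len(advantages)


# Each agent optimizes the same clipped objective against its own
# noisy view of the shared advantage estimates.
shared = [1.0, -2.0, 0.5]
ratios = [1.1, 0.9, 1.3]
per_agent = noisy_advantages(shared, n_agents=3, noise_std=0.1, seed=0)
losses = [ppo_clip_loss(ratios, adv) for adv in per_agent]
```

Because every agent sees a slightly different advantage signal, their policy updates are less tightly coupled to a single critic estimate, which is the intuition behind using the noise as a regularizer against overfitted cooperative policies.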

     
