Abstract:
To meet the requirements of autonomy, rapidity, and adaptability in collaborative planning among the subsystems of a deep space probe during an attachment mission, a collaborative planning strategy based on proximal policy optimization and multi-agent reinforcement learning is proposed. A multi-agent autonomous task planning model is designed by combining the single-agent proximal policy optimization algorithm with a hybrid multi-agent collaboration mechanism. A noise-regularized advantage value is introduced to address overfitting of the collaborative strategy during centralized multi-agent training. Simulation results show that the proposed multi-agent reinforcement learning method for collaborative autonomous task planning can optimize the collaboration strategy for small celestial body attachment missions in response to real-time environmental changes. Compared with the previous algorithm, it improves the task planning success rate and the quality of planning solutions, and it shortens planning time.