MAO Weiyang, WANG Bin, LIU Jingxing, XIONG Xin. An Autonomous Planning Method for Deep Space Exploration Tasks in Reinforcement Learning Based on Dynamic Rewards[J]. Journal of Deep Space Exploration, 2023, 10(2): 220-230. DOI: 10.15982/j.issn.2096-9287.2023.20220049

An Autonomous Planning Method for Deep Space Exploration Tasks in Reinforcement Learning Based on Dynamic Rewards

  • Aiming at the multi-system parallelism and the multiple constraints that must be satisfied during autonomous mission planning of deep space probes, a dynamic-reward-based method for constructing a reinforcement learning autonomous task planning model was proposed, and a deep space probe agent was established. In the interactive environment, a policy network and a loss function integrating resource, time, and ordering constraints were constructed, and a dynamic reward mechanism was proposed to improve the traditional policy gradient learning method. Simulation results show that the method achieves autonomous task planning: compared with the static-reward policy gradient algorithm, the planning success rate and planning efficiency are significantly improved, and planning can start from any state without changing the model structure, which improves the accuracy of the algorithm. The method provides a new solution for autonomous mission planning and decision-making of deep space probes.
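The abstract describes a policy network trained with a policy-gradient method whose reward varies with planning progress and constraint violations. The sketch below illustrates that general idea in PyTorch; the `dynamic_reward` shaping rule, the network architecture, and the REINFORCE-style update are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Small softmax policy over discrete planning actions (illustrative sizes)."""
    def __init__(self, n_states, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions), nn.Softmax(dim=-1),
        )

    def forward(self, state):
        return self.net(state)

def dynamic_reward(base_reward, step, violations):
    # Hypothetical dynamic shaping: the reward grows with planning progress
    # and is penalized when resource/time/ordering constraints are violated.
    return base_reward * (1.0 + 0.1 * step) - 5.0 * violations

def reinforce_update(policy, optimizer, episode, gamma=0.99):
    # episode: list of (state, action, shaped_reward) tuples collected with
    # dynamic_reward() applied at each step.
    returns, G = [], 0.0
    for _, _, r in reversed(episode):          # discounted return-to-go
        G = r + gamma * G
        returns.insert(0, G)
    loss = 0.0
    for (s, a, _), G in zip(episode, returns): # REINFORCE loss: -log pi(a|s) * G
        probs = policy(torch.as_tensor(s, dtype=torch.float32))
        loss = loss - torch.log(probs[a]) * G
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In use, the agent would interact with a simulated probe environment, compute each step's shaped reward via `dynamic_reward`, and call `reinforce_update` once per episode; a static-reward baseline would simply return `base_reward` unchanged.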
