In recent years, reinforcement learning (RL) has advanced steadily and has produced promising results on continuous-action tasks 1,2,3,4,5. In dealing with some continuous ...