Abstract
This study introduces artificial-intelligence-based energy management for a hybrid energy system. It comprises two parts: operating cost optimization based on proximal policy optimization (PPO) and multi-objective optimization based on soft actor-critic (SAC).
In the PPO-based operating cost optimization, a dynamic power conversion algorithm between the electricity and heat sub-models of the IES is investigated. The SO adaptively decides the wind power conversion ratio using a deep reinforcement learning (DRL) methodology, according to the electricity price of the upper-level grid, the CUs' energy demand profiles, and wind power generation. The study first formulates the dynamic power conversion as a finite discrete Markov decision process (MDP) and employs PPO to solve this decision-making problem. With the DRL methodology, the SO (agent) does not require a prespecified model of the IES (environment) in which an energy conversion ratio (action) is selected. Through continual online learning, it can respond to a dynamic, changing environment, accounting for the variability of wind power generation and electricity prices and the uncertainty in the CUs' load demand profiles. Numerical simulation results show that the proposed dynamic power conversion algorithm effectively minimizes the system's operating cost, which increases the SO's profit.
In the second study, a SAC-based energy management strategy is proposed to deliver scheduling that is both reliable and economical. Despite the complex uncertainties present in the IPHNGE system, such as the variability of load demands and the intermittent nature of wind energy, the DRL-based agent can formulate an energy scheduling plan in real time. The scheduling problem is first cast as a constrained optimal control problem with several optimization targets and a variety of constraints. It is then formulated as a decision-making problem using a finite discrete MDP, making it amenable to solution by the SAC algorithm. After each execution of the current strategy, the state, the control signals (action), and the associated reward are recorded in the experience replay buffer. Over many such interactions, the strategy is progressively improved to increase the reward. Finally, numerical cases conducted on the IEEE 39-bus power system, a 6-node heating system, and a 20-node gas system demonstrate that the proposed SAC-based energy management strategy achieves better optimization results and constraint satisfaction than the benchmark RL algorithms and the conventional optimization algorithm.
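To make the experience replay step concrete, the following is a minimal Python sketch of a buffer that records transitions and returns random mini-batches for training. It is an illustration only, not the implementation used in this study: the names `Transition` and `ReplayBuffer`, the `next_state` and `done` fields, the capacity, and the toy values are all assumptions introduced here.

```python
import random
from collections import deque
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One agent-environment interaction (field layout is an assumption)."""
    state: Sequence[float]       # observed system state
    action: Sequence[float]      # control signals chosen by the agent
    reward: float                # reward received for that action
    next_state: Sequence[float]  # assumed field: state after the action
    done: bool                   # assumed field: end-of-episode flag


class ReplayBuffer:
    """Fixed-size buffer; the oldest transitions are evicted first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement from the stored transitions.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


# Toy usage with made-up one-dimensional states and actions.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(Transition([float(t)], [0.5], -1.0 * t, [float(t + 1)], False))
batch = buffer.sample(2)
```

In this toy run the buffer holds only the three most recent transitions, so sampled batches never contain the two oldest ones. An off-policy learner such as SAC would draw batches like `batch` to update its networks.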