Cite this article: | PENG Hanmei, YAN Fei, TAN Mao, SU Yongxin, LI Hui. PPO-based dual agents security correction method for electricity-gas integrated energy system considering time-delay of gas network[J]. Electric Power Automation Equipment, 2025, 45(2): 51-60. |
|
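Abstract: |
In an electricity-gas regional integrated energy system (EGRIES), electricity and gas are coupled and are transmitted at different rates, so security correction involves many control variables whose adjustment time scales differ. To address this, a multi-time-scale security correction control method for EGRIES based on dual-agent deep reinforcement learning is proposed. Based on the EGRIES multi-energy flow model and the slow regulation of the natural gas network, the control variables are classified by adjustment time scale. A security correction control framework based on dual-agent reinforcement learning is constructed, in which two cooperative agents decide the adjustment amounts of the long-time-scale and short-time-scale control variables respectively, and the Agent 1 and Agent 2 models are designed based on the proximal policy optimization (PPO) algorithm. On this basis, the PPO dual agents are trained offline; when the system enters an emergency state, the two agents cooperate online to produce a reliable security correction control strategy that restores the system to a normal state. Case study simulation results verify the effectiveness of the proposed method. |
Keywords: electricity-gas regional integrated energy system; security correction control; dual agents; PPO algorithm; adjustment time scale |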
DOI:10.16081/j.epae.202411018 |
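CLC number: TM73; TK01 |
Funding: Natural Science Foundation of Hunan Province (2023JJ50241); Scientific Research Project of the Education Department of Hunan Province (23A0142) |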
|
PPO-based dual agents security correction method for electricity-gas integrated energy system considering time-delay of gas network |
PENG Hanmei¹, YAN Fei¹, TAN Mao¹,², SU Yongxin¹,², LI Hui¹,²
|
1. College of Automation and Electronic Information, Xiangtan University, Xiangtan 411105, China; 2. Hunan Engineering Research Center of Multi-energy Cooperative Control Technology, Xiangtan University, Xiangtan 411105, China
|
Abstract: |
In an electricity-gas regional integrated energy system (EGRIES), the coupling of electricity and gas and the difference between their transmission rates mean that security correction involves many control variables with different adjustment time scales. Therefore, a multi-time-scale security correction control method for EGRIES based on dual-agent deep reinforcement learning is proposed. Based on the multi-energy flow model of EGRIES and the slow regulation of the natural gas network, the control variables are classified by adjustment time scale. A security correction control framework based on dual-agent reinforcement learning is constructed, in which two cooperative agents decide the adjustment amounts of the control variables on the long and short time scales respectively. The models of Agent 1 and Agent 2 are designed based on the proximal policy optimization (PPO) algorithm. On this basis, the PPO-based dual agents are trained offline. When the system enters an emergency state, the two agents cooperate online to produce a reliable security correction control strategy that restores the system to a normal state. The effectiveness of the proposed method is verified by simulation results. |
Key words: electricity-gas regional integrated energy system; security correction control; dual agents; PPO algorithm; adjustment time-scales |
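
The abstract names the PPO algorithm as the basis of both agents but gives no formulas or hyperparameters. For orientation only, the standard clipped surrogate objective of PPO can be sketched as follows. The function name `ppo_clip_objective` and the clipping value `eps=0.2` are illustrative assumptions, not taken from the paper, and the sketch says nothing about how the authors' Agent 1 and Agent 2 share information.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. eps is an assumed clip range.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio r_t
        clipped = max(1.0 - eps, min(ratio, 1.0 + eps))  # clip r_t to [1-eps, 1+eps]
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a dual-agent setting such as the one described, each agent would presumably maximize an objective of this form over its own policy parameters; that reading is an assumption based on the abstract alone.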