Cite this article: PENG Hanmei, YAN Fei, TAN Mao, SU Yongxin, LI Hui. PPO-based dual agents security correction method for electricity-gas integrated energy system considering time-delay of gas network[J]. Electric Power Automation Equipment, 2025, 45(2): 51-60.
PPO-based dual agents security correction method for electricity-gas integrated energy system considering time-delay of gas network
PENG Hanmei1, YAN Fei1, TAN Mao1,2, SU Yongxin1,2, LI Hui1,2
1.College of Automation and Electronic Information, Xiangtan University, Xiangtan 411105, China;2.Hunan Engineering Research Center of Multi-energy Cooperative Control Technology, Xiangtan University, Xiangtan 411105, China
Abstract:
In an electricity-gas regional integrated energy system (EGRIES), electricity and gas are coupled and are transmitted at different rates, so security correction involves many control variables with different adjustment time scales. To address this, a multi-time-scale security correction control method for EGRIES based on dual-agent deep reinforcement learning is proposed. Based on the multi-energy flow model of EGRIES and the slow regulation of the natural gas network, the control variables are classified by adjustment time scale. A security correction control framework based on dual-agent reinforcement learning is constructed, in which two cooperative agents decide the adjustment amounts of the long-time-scale and short-time-scale control variables, respectively. The models of Agent 1 and Agent 2 are designed based on the proximal policy optimization (PPO) algorithm. The PPO-based dual agents are then trained offline. When the system enters an emergency state, the two agents cooperate online to generate a reliable security correction control strategy that restores the system to a normal state. Simulation results on a case study verify the effectiveness of the proposed method.
Key words:  electricity-gas regional integrated energy system  security correction control  dual agents  PPO algorithm  adjustment time scale
DOI: 10.16081/j.epae.202411018
CLC number: TM73; TK01
Funding: Natural Science Foundation of Hunan Province (2023JJ50241); Scientific Research Project of the Hunan Provincial Department of Education (23A0142)
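
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate loss that PPO agents are commonly trained with, together with a purely hypothetical two-time-scale schedule. It is not taken from the paper: the function names, the `clip_eps` value, the `slow_period` parameter, and the assumption that Agent 1 is the slower agent acting every few steps are all illustrative, and the abstract does not specify how the two agents are interleaved.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, returned as a loss to minimize.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    clip_eps=0.2 is a common default, assumed here for illustration.
    """
    terms = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return -sum(terms) / len(terms)


def acting_agents(step, slow_period=4):
    """Hypothetical schedule (an assumption, not the paper's scheme):
    the fast agent acts every step, the slow agent every slow_period steps."""
    if step % slow_period == 0:
        return ("agent1", "agent2")
    return ("agent2",)


if __name__ == "__main__":
    print(ppo_clip_loss([0.5, 0.0], [0.0, 0.0], [1.0, -1.0]))
    print([acting_agents(t) for t in range(5)])
```

The clipping keeps each policy update close to the policy that collected the data, which is one reason PPO is often chosen when training stability matters.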