Forecasting-decision integrated scheduling of virtual power plant based on temporal deep reinforcement learning

Sun Guoqiang; Wang Xiwen; Zhou Yizhou; Sun Kang; Wei Zhinong; Zang Haixiang

Cite this article: Sun Guoqiang, Wang Xiwen, Zhou Yizhou, Sun Kang, Wei Zhinong, Zang Haixiang. Forecasting-decision integrated scheduling of virtual power plant based on temporal deep reinforcement learning[J]. Electric Power Automation Equipment, 2026, 46(5): 164-171

School of Electrical and Power Engineering, Hohai University, Nanjing 211100, Jiangsu, China

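Abstract:

As the penetration of renewable energy in virtual power plants keeps rising, the traditional sequential forecasting-then-decision scheduling method, which relies on accurate forecasts, struggles to meet virtual power plant scheduling needs in complex, uncertain environments because of error propagation and objective inconsistency. To address this, a forecasting-decision integrated scheduling method for virtual power plants based on temporal deep reinforcement learning is proposed, in which raw weather data are treated as scheduling factors and fed directly into the neural network for training. A virtual power plant aggregation model comprising photovoltaic units, gas turbines, and energy storage devices is established, and a forecasting-decision integrated scheduling architecture is proposed. The integrated scheduling problem is formulated as a reinforcement learning framework over a continuous state space, with its environment state set, agent action set, and related elements defined. A deep deterministic policy gradient algorithm that accounts for temporal characteristics is used for training. Simulation results show that the proposed method learns adaptively from environment feedback, effectively resolves the forecast error propagation problem, and improves the adaptability and security of the virtual power plant.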

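Keywords: virtual power plant; optimal scheduling; deep reinforcement learning; temporal modeling; forecasting-decision integrated scheduling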

DOI：10.16081/j.epae.202512030

Classification number: TM73

Funding: Project supported by the National Natural Science Foundation of China (U24B2088)

Forecasting-decision integrated scheduling of virtual power plant based on temporal deep reinforcement learning

Sun Guoqiang, Wang Xiwen, Zhou Yizhou, Sun Kang, Wei Zhinong, Zang Haixiang

School of Electrical and Power Engineering, Hohai University, Nanjing 211100, China

Abstract:

With the increasing penetration of renewable energy in virtual power plants, the traditional forecasting-decision sequential scheduling method, which depends on accurate forecasting, struggles to meet the scheduling demands of virtual power plants in complex, uncertain environments because of its error propagation and objective inconsistency. Therefore, a forecasting-decision integrated scheduling method for virtual power plants is proposed based on temporal deep reinforcement learning, in which raw weather data are regarded as scheduling factors and directly input into the neural network for training. A virtual power plant aggregation model containing photovoltaic units, gas turbines, and energy storage devices is established, and a forecasting-decision integrated scheduling architecture is proposed. The forecasting-decision integrated scheduling problem of the virtual power plant is formulated as a reinforcement learning framework over a continuous state space, and its environment state set, agent action set, and related elements are defined. A deep deterministic policy gradient algorithm that considers temporal features is adopted for training. The simulation results show that the proposed method can learn adaptively through environment feedback, effectively solve the forecasting error propagation problem, and improve the adaptability and security of the virtual power plant.

Key words: virtual power plant; optimal scheduling; deep reinforcement learning; temporal modeling; forecasting-decision integrated scheduling
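
The abstract does not spell out the state set, action set, or reward, so the following is only a minimal sketch of how a forecasting-decision integrated environment might look, not the authors' formulation. It assumes a state made of a sliding window of raw irradiance samples plus the current price and storage state of charge, and a two-dimensional continuous action for the gas turbine and the storage device. The class name `ToyVPPEnv`, every capacity and cost figure, the one-hour time step, and the revenue-minus-fuel-cost reward are all hypothetical. The point it illustrates is that no photovoltaic forecast appears in the state: the agent sees only raw weather history, and the realized photovoltaic output enters through the reward.

```python
from collections import deque


class ToyVPPEnv:
    """Toy virtual power plant: PV + gas turbine + storage (all values hypothetical)."""

    def __init__(self, irradiance, price, window=4):
        assert len(irradiance) == len(price)
        self.irradiance = irradiance  # raw weather series, W/m^2
        self.price = price            # market price series, currency/MWh
        self.window = window          # number of weather samples kept in the state
        self.pv_capacity = 5.0        # MW at 1000 W/m^2 (assumed)
        self.gt_max = 4.0             # MW (assumed)
        self.gt_cost = 60.0           # currency/MWh (assumed)
        self.ess_power = 2.0          # MW (assumed)
        self.ess_energy = 8.0         # MWh (assumed)
        self.reset()

    def reset(self):
        self.t = 0
        self.soc = 0.5
        # Sliding window of raw weather observations, zero-padded at the start.
        self.history = deque([0.0] * self.window, maxlen=self.window)
        self.history.append(self.irradiance[0])
        return self._state()

    def _state(self):
        # State = recent raw weather + current price + storage state of charge.
        return list(self.history) + [self.price[self.t], self.soc]

    def step(self, action):
        # Both action components are clipped to [-1, 1].
        a_gt, a_ess = (max(-1.0, min(1.0, a)) for a in action)
        p_gt = 0.5 * (a_gt + 1.0) * self.gt_max   # map [-1, 1] to [0, gt_max]
        p_ess = a_ess * self.ess_power            # > 0 discharges, < 0 charges
        # Limit storage power by the energy available over a one-hour step.
        max_discharge = self.soc * self.ess_energy
        max_charge = (1.0 - self.soc) * self.ess_energy
        p_ess = max(-max_charge, min(max_discharge, p_ess))
        self.soc -= p_ess / self.ess_energy
        # Realized PV output comes from the actual weather, not from a forecast.
        p_pv = self.pv_capacity * self.irradiance[self.t] / 1000.0
        p_net = p_pv + p_gt + p_ess
        reward = self.price[self.t] * p_net - self.gt_cost * p_gt
        self.t += 1
        done = self.t >= len(self.price)
        if done:
            return None, reward, done
        self.history.append(self.irradiance[self.t])
        return self._state(), reward, done
```

A continuous-action agent such as a deep deterministic policy gradient learner would call `reset()` once per episode and `step(action)` once per hour, storing the returned transitions for training. A sequence model over the weather window would be one way to capture the temporal features the abstract mentions, although the network design used in the paper is not described here.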
