Electric Power Automation Equipment (《电力自动化设备》)

Cite this article: WU Xi, TANG Ziyi, XU Qingshan, ZHOU Yizhou. Q-learning algorithm based method for enhancing resiliency of integrated energy system[J]. Electric Power Automation Equipment, 2020, 40(4):


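Q-learning algorithm based method for enhancing resiliency of integrated energy system
WU Xi (吴熙)¹, TANG Ziyi (唐子逸)¹, XU Qingshan (徐青山)¹, ZHOU Yizhou (周亦洲)²
1. School of Electrical Engineering, Southeast University, Nanjing 210096, Jiangsu, China; 2. College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, Jiangsu, China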

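Abstract:

The stochastic dynamic optimization problem of an integrated energy system is modeled as a Markov decision process, and the Q-learning algorithm is introduced to solve this complex problem. To address the drawbacks of the Q-learning algorithm, two improvements are made to conventional Q-learning: the Q-table initialization method is improved, and the upper confidence bound (UCB) algorithm is used for action selection. Simulation results show that the Q-learning algorithm solves the problem while maintaining good convergence; that the improved initialization method and the UCB algorithm significantly improve computational efficiency and make the results converge to a better solution; and that, compared with a conventional mixed-integer linear programming model, the Q-learning algorithm yields better optimization results.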
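
For orientation, the two ingredients named in the abstract can be written in their standard textbook forms. This is only a sketch: the symbols below (learning rate \alpha, discount factor \gamma, exploration constant c, visit counts N) are assumptions introduced here for illustration, and the paper's exact formulation, including its improved Q-table initialization, is not given in the abstract.

```latex
% Standard tabular Q-learning update (textbook form, not taken from the paper)
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

% Standard upper confidence bound (UCB) action selection;
% N(s_t) counts visits to state s_t, N(s_t, a) counts selections of a in s_t
a_t = \arg\max_{a} \left[ Q(s_t, a) + c \sqrt{\frac{\ln N(s_t)}{N(s_t, a)}} \right]
```

The square-root term shrinks as an action is tried more often in a given state, so rarely tried actions receive a temporary bonus and exploration tapers off as the counts grow.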

Keywords: integrated energy system; islanded operation; Markov decision process; Q-learning algorithm; resiliency

DOI: 10.16081/j.epae.202002006

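CLC number: TM73; TK01

Funding: Science and Technology Project of State Grid Corporation of China (SGJSJX00YJJS1800721); Key Program of the National Natural Science Foundation of China (51936003)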

Q-learning algorithm based method for enhancing resiliency of integrated energy system

WU Xi¹, TANG Ziyi¹, XU Qingshan¹, ZHOU Yizhou²

1. School of Electrical Engineering, Southeast University, Nanjing 210096, China; 2. College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China

Abstract:

The stochastic dynamic optimization problem of an integrated energy system is modeled as a Markov decision process, and the Q-learning algorithm is introduced to solve this complex problem. To overcome the drawbacks of the Q-learning algorithm, two improvements are made to conventional Q-learning: the Q-table initialization method is improved, and the upper confidence bound (UCB) algorithm is adopted for action selection. Simulation results show that the Q-learning algorithm maintains good convergence while solving the problem, and that the improved initialization method and the UCB algorithm significantly improve computational efficiency and make the results converge to a better solution. Moreover, compared with a conventional mixed-integer linear programming model, the Q-learning algorithm achieves better optimization results.

Key words: integrated energy system; islanded operation; Markov decision process; Q-learning algorithm; resiliency
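
As a rough illustration of the technique the abstract names, the following is a minimal sketch of tabular Q-learning with upper confidence bound (UCB) action selection. Everything in it is an assumption made for illustration: the five-state chain environment, the function names `step`, `ucb_action` and `train`, and all hyperparameter values are hypothetical and stand in for the integrated energy system model, which the abstract does not specify. The Q-table is initialized to a constant `q_init`; the paper's improved initialization method is not described in the abstract and is not reproduced here.

```python
import math

N_STATES = 5    # hypothetical toy chain: states 0..4, state 4 is terminal
N_ACTIONS = 2   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy transition; reward 1.0 only on reaching the last state."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def ucb_action(q_row, counts_row, c):
    """Return the action maximizing Q + c*sqrt(ln(n_state)/n_action).

    Untried actions get an infinite score so each is tried at least once;
    ties are broken in favor of the lowest action index.
    """
    total = sum(counts_row)
    scores = []
    for q, n in zip(q_row, counts_row):
        if n == 0:
            scores.append(math.inf)
        else:
            scores.append(q + c * math.sqrt(math.log(total) / n))
    return scores.index(max(scores))


def train(episodes=200, max_steps=50, alpha=0.5, gamma=0.9, c=1.0, q_init=0.0):
    """Tabular Q-learning with UCB exploration on the toy chain."""
    q = [[q_init] * N_ACTIONS for _ in range(N_STATES)]
    counts = [[0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = ucb_action(q[state], counts[state], c)
            counts[state][action] += 1
            nxt, reward, done = step(state, action)
            # No bootstrapping from the terminal state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
print(greedy_policy)
```

Because both the toy environment and the UCB rule are deterministic, repeated runs give the same Q-table. In a stochastic setting such as the one the paper targets, the same loop would apply, with `step` sampling the next state and reward instead of computing them.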
