Cite this article: WU Xi, TANG Ziyi, XU Qingshan, ZHOU Yizhou. Q-learning algorithm based method for enhancing resiliency of integrated energy system[J]. Electric Power Automation Equipment, 2020, 40(4):
DOI:10.16081/j.epae.202002006
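CLC number: TM73; TK01
Funding: Science and Technology Project of State Grid Corporation of China (SGJSJX00YJJS1800721); Key Program of the National Natural Science Foundation of China (51936003)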
Q-learning algorithm based method for enhancing resiliency of integrated energy system
WU Xi¹, TANG Ziyi¹, XU Qingshan¹, ZHOU Yizhou²
1. School of Electrical Engineering, Southeast University, Nanjing 210096, China; 2. College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China
Abstract:
The stochastic dynamic optimization problem of an integrated energy system is modeled as a Markov decision process, and the Q-learning algorithm is introduced to solve it. To address the drawbacks of Q-learning, two improvements are made to the conventional algorithm: the Q-table initialization method is improved, and the upper confidence bound (UCB) algorithm is adopted for action selection. Simulation results show that the Q-learning algorithm solves the problem with good convergence, and that the improved initialization method and the UCB algorithm significantly improve computational efficiency and make the results converge to a better solution. Moreover, compared with a conventional mixed-integer linear programming model, the Q-learning algorithm achieves better optimization results.
Key words: integrated energy system; islanded operation; Markov decision process; Q-learning algorithm; resiliency
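
As a rough illustration of the two ingredients named in the abstract, the following is a minimal sketch of tabular Q-learning with upper confidence bound (UCB) action selection. It is not the authors' model: the toy environment `toy_step`, the learning rate `alpha`, the discount factor `gamma`, and the exploration weight `c` are all assumptions made for this example. The abstract does not describe the improved Q-table initialization, so `q_init` here is only a placeholder showing where such a scheme would plug in.

```python
import math
import random


def ucb_action(q_row, counts_row, t, c=1.0):
    """Return the action maximising Q + c*sqrt(ln t / n); untried actions go first."""
    for a, n in enumerate(counts_row):
        if n == 0:
            return a
    scores = [q + c * math.sqrt(math.log(t) / n) for q, n in zip(q_row, counts_row)]
    return max(range(len(scores)), key=scores.__getitem__)


def q_learning_ucb(step, n_states, n_actions, q_init=0.0, episodes=500,
                   horizon=20, alpha=0.1, gamma=0.95, c=1.0, seed=0):
    """Tabular Q-learning; `step(s, a, rng)` must return (next_state, reward, done)."""
    rng = random.Random(seed)
    # q_init is a placeholder for a custom initialization scheme (assumption).
    q = [[q_init] * n_actions for _ in range(n_states)]
    counts = [[0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for _ in range(horizon):
            t = sum(counts[s]) + 1  # number of visits to s, including this one
            a = ucb_action(q[s], counts[s], t, c)
            counts[s][a] += 1
            s_next, r, done = step(s, a, rng)
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])  # standard Q-learning update
            if done:
                break
            s = s_next
    return q


def toy_step(s, a, rng):
    """Hypothetical 3-state chain: action 0 stays and costs 1; action 1 tries to advance."""
    if a == 0:
        return s, -1.0, False
    if rng.random() < 0.8:  # advancing succeeds with probability 0.8
        s_next = s + 1
        done = s_next == 2  # state 2 is the terminal goal
        return s_next, (1.0 if done else 0.0), done
    return s, 0.0, False


q = q_learning_ucb(toy_step, n_states=3, n_actions=2)
# Greedy action in each non-terminal state.
policy = [max(range(2), key=q[s].__getitem__) for s in range(2)]
print(policy)
```

In this sketch the UCB bonus shrinks as an action is tried more often, so exploration is directed at rarely tried actions rather than chosen uniformly at random as in epsilon-greedy selection.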
