On-line building energy optimization using deep reinforcement learning

E. Mocanu (Corresponding author), D.C. Mocanu, P.H. Nguyen, A. Liotta, M.E. Webber, M. Gibescu, J.G. Slootweg

Research output: Contribution to journal › Article › Academic › peer-review

242 Citations (Scopus)
173 Downloads (Pure)


Unprecedentedly high volumes of data are becoming available with the growth of the advanced metering infrastructure. These data are expected to benefit the planning and operation of future power systems and to help customers transition from a passive to an active role. In this paper, we explore for the first time in the smart grid context the benefits of using deep reinforcement learning, a hybrid class of methods that combines reinforcement learning with deep learning, to perform on-line optimization of schedules for building energy management systems. The learning procedure was explored using two methods, deep Q-learning and deep policy gradient, both of which have been extended to perform multiple actions simultaneously. The proposed approach was validated on the large-scale Pecan Street Inc. database. This high-dimensional database includes information about photovoltaic power generation, electric vehicles, and building appliances. Moreover, these on-line energy scheduling strategies could be used to provide real-time feedback to consumers to encourage more efficient use of electricity.
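The idea of an agent that switches several appliances in the same time step can be illustrated with a minimal sketch. It is not the method of the paper: plain tabular Q-learning stands in for the deep networks, and the prices, power ratings, penalty, horizon, and all function names are hypothetical values chosen for illustration, not taken from the Pecan Street data.

```python
import itertools
import random

# Hypothetical toy problem: two appliances must each run for one time slot
# within a four-slot horizon. All numbers below are made-up assumptions.
PRICES = [0.30, 0.10, 0.25, 0.40]   # cost per kWh in each slot (assumed)
POWER = [1.0, 2.0]                   # kWh drawn per slot by each appliance (assumed)
UNMET_PENALTY = 5.0                  # charged per appliance that never ran (assumed)

# A joint action switches every appliance on (1) or off (0) in the same step.
ACTIONS = list(itertools.product([0, 1], repeat=len(POWER)))


def step(state, action):
    """Apply a joint action; return (next_state, reward, done)."""
    t, remaining = state
    cost = PRICES[t] * sum(p * a for p, a in zip(POWER, action))
    remaining = tuple(max(r - a, 0) for r, a in zip(remaining, action))
    t += 1
    done = t == len(PRICES)
    reward = -cost
    if done:
        reward -= UNMET_PENALTY * sum(remaining)
    return (t, remaining), reward, done


def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over joint actions."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated return
    for _ in range(episodes):
        state, done = (0, (1, 1)), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_schedule(q):
    """Roll out the greedy policy and list the joint action taken in each slot."""
    state, done, schedule = (0, (1, 1)), False, []
    while not done:
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        schedule.append(action)
        state, _, done = step(state, action)
    return schedule
```

Calling `greedy_schedule(train())` returns one on/off tuple per time slot. Enumerating joint actions works only for a handful of appliances, since the action set doubles with each added device; that growth is one reason to replace the table with a function approximator in larger problems.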

Original language: English
Article number: 8356086
Pages (from-to): 3698-3708
Number of pages: 11
Journal: IEEE Transactions on Smart Grid
Issue number: 4
Publication status: Published - Jul 2019


  • Buildings
  • Deep Neural Networks
  • Deep Reinforcement Learning
  • Demand Response
  • Energy consumption
  • Learning (artificial intelligence)
  • Machine learning
  • Minimization
  • Optimization
  • Smart Grid
  • Smart grids
  • Strategic Optimization

