Deep clustering of reinforcement learning based on the bang-bang principle to optimize the energy in multi-boiler for intelligent buildings

Raad Z. Homod*, Basil Sh Munahi, Hayder Ibrahim Mohammed, Musatafa Abbas Abbood Albadr, AISSA Abderrahmane, Jasim M. Mahdi, Mohamed Bechir Ben Hamida, Bilal Naji Alhasnawi, A. S. Albahri, Hussein Togun, Umar F. Alqsair, Zaher Mundher Yaseen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


The bang-bang relays used to control multiple-boiler systems (MBS) are characterized by complex limiter saturation functions and are classified as fixed-parameter controllers. Their action signals cannot precisely track the nonlinear, dynamic heating demand of a building over the entire operating range. Moreover, in a mono-boiler system, the bang-bang controller suffers increasing short cycling during partial-load periods because the boiler is effectively oversized for most of its running time, which leads to high energy consumption and fluctuating indoor thermal comfort. It is therefore difficult for such controllers to cope with uncertainties in the outdoor environment and the indoor heating load. Hence, this study formulates the MBS control problem as a dynamic Markov decision process and applies a deep clustering of reinforcement learning approach to obtain the optimal control policy through interaction with the environment, based on multi-agent learning with bang-bang actions. The result is a new boiler sequencing control (BSC) strategy that uses deep clustering of reinforcement learning in a bang-bang manner (DCRLBB). The deep clustering is configured to break Lagrangian trajectory curves into piecewise segments that represent the RL agent's action policy. The agent's action policy signals are configured from the bang-bang reward formula, based on trade-off implications, so that they are more adjustable than traditional fixed-parameter controllers such as the fuzzy bang-bang controller (FBBC). The BSC agent significantly affects the energy performance of the MBS, whereas the other agent resizes boiler capacity by adjusting the boiler's solenoid fuel valve. A comparison between the proposed strategy and the conventional FBBC shows that DCRLBB responds markedly better under actual dynamic indoor/outdoor conditions and achieves energy savings of more than 32% while keeping indoor thermal conditions within the comfortable range.
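
The short-cycling problem described in the abstract can be illustrated with a minimal Python sketch. It is not the DCRLBB method of the paper: it only simulates a plain fixed-parameter bang-bang relay with hysteresis acting on a single boiler. The first-order room model, every parameter value, and the names `bang_bang` and `simulate` are assumptions made up for this illustration.

```python
def bang_bang(temp, setpoint, band, was_on):
    """Relay with hysteresis: on below the band, off above it, else hold state."""
    if temp <= setpoint - band:
        return True
    if temp >= setpoint + band:
        return False
    return was_on


def simulate(boiler_kw, steps=1440, dt_min=1.0, outdoor_c=5.0,
             setpoint=21.0, band=0.5):
    """Run a one-day, one-minute-step simulation of a single heated zone.

    Returns (number of boiler starts, list of indoor temperatures).
    """
    # Hypothetical building parameters, chosen only for illustration.
    loss_kw_per_k = 0.25       # heat loss per kelvin of indoor/outdoor difference
    capacity_kwh_per_k = 2.0   # lumped thermal capacity of the zone

    temp = setpoint
    on = False
    starts = 0
    temps = []
    for _ in range(steps):
        new_on = bang_bang(temp, setpoint, band, on)
        if new_on and not on:
            starts += 1        # count each off-to-on transition
        on = new_on
        heat_kw = boiler_kw if on else 0.0
        net_kw = heat_kw - loss_kw_per_k * (temp - outdoor_c)
        # Explicit Euler step: kW * hours / (kWh/K) gives a change in kelvin.
        temp += net_kw * (dt_min / 60.0) / capacity_kwh_per_k
        temps.append(temp)
    return starts, temps


if __name__ == "__main__":
    for size_kw in (6.0, 20.0):
        starts, _ = simulate(size_kw)
        print(f"{size_kw:.0f} kW boiler: {starts} starts in 24 h")
```

Under these assumed values the steady heating load is about 4 kW, so the 20 kW boiler is heavily oversized. It crosses the hysteresis band much faster than the 6 kW boiler and therefore starts more often, which is the partial-load behaviour that motivates sequencing several smaller boilers.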

Original language: English
Article number: 122357
Journal: Applied Energy
State: Published - 15 Feb 2024

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Ltd


  • Control boiler systems
  • Deep clustering
  • Energy management
  • Lagrangian interpolation formula
  • Reinforcement learning agents
  • Smart buildings

ASJC Scopus subject areas

  • Building and Construction
  • Renewable Energy, Sustainability and the Environment
  • Mechanical Engineering
  • General Energy
  • Management, Monitoring, Policy and Law

