An enhanced deep reinforcement learning approach for efficient, effective, and equitable disaster relief distribution

  • Moiz Ahmad
  • Muhammad Tayyab
  • Muhammad Salman Habib*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Efficient disaster response, especially within the critical initial 72 h, is crucial for saving lives. However, allocating relief goods effectively to affected areas remains a complex challenge due to uncertainty, limited resources, and dynamic needs. This study addresses this challenge by proposing a multi-period integer nonlinear programming model for the efficient, effective, and equitable distribution of relief goods during the disaster response phase. To optimize relief allocation over the entire 72-h period, a novel decision-making approach is proposed that leverages the proximal policy optimization (PPO) algorithm. It uses deep residual neural networks for state-value and optimal-action prediction, with 5 residual layers in the value network and 4 in the policy network. Additionally, an algorithm-agnostic termination criterion based on episodic reward stall ensures effective convergence detection without requiring prior knowledge of the optimal solution. The proposed model and solution methods are validated through 30 hypothetical problem instances and a realistic earthquake response case study. The results demonstrate the superiority of the proposed approach over traditional methods such as dynamic programming, state-action-reward-state-action (SARSA), and Q-learning in terms of both solution quality and sample efficiency. Notably, the deep residual networks and the proposed termination criterion enable the PPO algorithm to achieve an average optimality gap of less than 10% for the majority of instances with consistent hyperparameters, while exhibiting significant sample efficiency gains, particularly for large-scale problems. This research empowers disaster managers with an efficient and timely relief delivery plan, ultimately contributing to saving lives in the face of disaster. Moreover, the proposed termination criterion may improve the performance of reinforcement learning in other application areas.
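
The abstract does not spell out how the reward-stall termination criterion is computed, so the following is only a minimal sketch of one plausible reading: stop training once a moving average of episodic rewards has not improved for a set number of episodes. The class name `RewardStallDetector` and the parameters `window`, `patience`, and `min_delta` are hypothetical and are not taken from the paper.

```python
from collections import deque


class RewardStallDetector:
    """Signals termination when the smoothed episodic reward stops improving.

    Hypothetical helper: window, patience, and min_delta are illustrative
    parameters, not values reported by the authors.
    """

    def __init__(self, window=20, patience=50, min_delta=1e-3):
        self.rewards = deque(maxlen=window)  # most recent episodic rewards
        self.patience = patience             # episodes tolerated without improvement
        self.min_delta = min_delta           # smallest gain counted as improvement
        self.best_mean = float("-inf")
        self.stalled_for = 0

    def update(self, episodic_reward):
        """Record one episode's total reward; return True when training should stop."""
        self.rewards.append(episodic_reward)
        if len(self.rewards) < self.rewards.maxlen:
            return False  # not enough history for a moving average yet
        mean = sum(self.rewards) / len(self.rewards)
        if mean > self.best_mean + self.min_delta:
            self.best_mean = mean
            self.stalled_for = 0
        else:
            self.stalled_for += 1
        return self.stalled_for >= self.patience


# Toy usage: rewards climb, then plateau, and the detector eventually fires.
detector = RewardStallDetector(window=3, patience=2, min_delta=0.0)
for episode, reward in enumerate([1, 2, 3, 4, 4, 4, 4, 4]):
    if detector.update(reward):
        print(f"reward stalled, stopping after episode {episode}")
        break
```

Because a check of this kind looks only at the stream of episodic rewards, it could be attached to PPO, SARSA, or Q-learning alike and needs no known optimum, which is in the spirit of the "algorithm-agnostic" property the abstract describes.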

Original language: English
Article number: 110002
Journal: Engineering Applications of Artificial Intelligence
Volume: 143
State: Published - 1 Mar 2025

Bibliographical note

Publisher Copyright:
© 2025 Elsevier Ltd

Keywords

  • Disaster response
  • Proximal policy optimization
  • Q-learning
  • Reinforcement learning
  • Relief distribution
  • Solution quality

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
