Abstract
This study investigates whether state-of-the-art multimodal large language models (LLMs) can autonomously perform the entire Hazard and Operability (HAZOP) study process without human intervention. Four LLMs (GPT4o, GPT4o-mini, LLAMA, and Gemini) were used to automatically generate HAZOP worksheets spanning dozens of pages from an identical piping and instrumentation diagram (P&ID) using a standardized prompt. Their outputs were benchmarked against an expert-prepared reference worksheet and evaluated along two key aspects: (1) model performance, measured by similarity and computational cost, and (2) HAZOP performance, measured by the validity of the generated scenarios and the diversity of safeguards. All four LLMs achieved high similarity to the reference (F1 scores above 86%). LLAMA was the most cost-efficient ($0.011 per worksheet), while Gemini generated the greatest number of scenarios (34.3 per worksheet) and safeguards (1.79 per deviation). This study presents a structured framework for evaluating LLMs in HAZOP and highlights their potential as assistive tools in the process safety field. However, key limitations were observed: the proportion of semantically valid scenarios remained low (0.19 to 0.37), and safeguards were heavily biased toward procedural measures, indicating limited diversity in risk-mitigation strategies. To enhance the reliability and practicability of LLM-based HAZOP studies, future research should focus on advanced prompt engineering, domain-specific fine-tuning, and improved reasoning capabilities.
| Original language | English |
|---|---|
| Article number | 107039 |
| Journal | Safety Science |
| Volume | 194 |
| State | Published - Feb 2026 |
Bibliographical note
Publisher Copyright: © 2025 Elsevier Ltd.
Keywords
- HAZOP
- Large language model
- Natural language processing
- Process hazard analysis
- Process safety
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Safety Research
- Public Health, Environmental and Occupational Health