Tree-based machine learning models for predicting the maximum depth of corrosion defects based on historical in-line inspection data

Research output: Contribution to journal › Article › peer-review

Abstract

Oil and gas pipelines are the primary means of fluid transportation in the industry due to their efficiency, reliability, and cost-effectiveness. However, pipeline corrosion poses significant risks, leading to loss of containment, operational interruptions, and potential loss of life if undetected or unmitigated. The introduction of novel and corrosive fluids, such as hydrogen and carbon dioxide (CO2), is expected to exacerbate corrosion-related issues. Consequently, pipeline inspection techniques, particularly in-line inspection (ILI), are increasingly vital for corrosion monitoring. This paper establishes a framework for utilizing ILI data to develop machine learning models for corrosion prediction. Four tree-based machine learning techniques (eXtreme Gradient Boosting (XGBoost), Dropouts meet multiple Additive Regression Trees (DART), Light Gradient-Boosting Machine (LightGBM) with linear trees, and random forests) were employed to predict the maximum depth of corrosion defects based exclusively on historical ILI data. All models significantly outperformed the naïve forecasting benchmark, with the best model achieving a root mean square error (RMSE) of 0.368 mm on the test set, surpassing the benchmark by 41.5 %. Furthermore, the accuracy of these models exceeded that of most service-based corrosion prediction models in the literature. This was achieved using an ILI dataset that exhibited extreme variations in maximum depth distribution and changes in reporting criteria. These findings indicate that the proposed framework is a superior alternative to service-based models, leveraging the vast amount of ILI data available to pipeline operators. Additionally, the service-agnostic framework supports the integration of process and service parameters, further enhancing the efficacy of machine learning models for corrosion prediction.
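The comparison against a naïve benchmark reported in the abstract can be illustrated with a minimal sketch. It assumes that the naïve forecast simply carries the maximum depth from the previous inspection forward, and that the reported improvement is the relative reduction in RMSE; neither definition is stated in the abstract. The depth values and the function names are hypothetical and are not taken from the paper's dataset or code.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences (mm)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_over_benchmark(model_rmse, benchmark_rmse):
    """Relative RMSE reduction in percent (assumed definition)."""
    return 100.0 * (benchmark_rmse - model_rmse) / benchmark_rmse


# Hypothetical maximum defect depths in mm, for illustration only.
previous_depth = [1.0, 2.0, 3.0, 4.0]    # earlier ILI run
current_depth = [1.5, 2.5, 3.0, 5.0]     # later ILI run (ground truth)
model_prediction = [1.4, 2.4, 3.2, 4.7]  # output of some fitted model

# Assumed naive benchmark: no growth since the previous inspection.
naive_rmse = rmse(current_depth, previous_depth)
model_rmse = rmse(current_depth, model_prediction)

print(f"naive RMSE: {naive_rmse:.3f} mm")
print(f"model RMSE: {model_rmse:.3f} mm")
print(f"improvement: {improvement_over_benchmark(model_rmse, naive_rmse):.1f} %")
```

Under these assumptions, any of the four tree-based regressors named in the abstract could supply `model_prediction`; the scoring step stays the same.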

Original language: English
Article number: 100308
Journal: Journal of Pipeline Science and Engineering
State: Accepted/In press - 2025

Bibliographical note

Publisher Copyright:
© 2025 The Authors

Keywords

  • In-line inspection
  • Machine learning
  • Maximum depth prediction
  • Pipeline corrosion
  • Statistical analysis
  • Supervised learning
  • Sustainability

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Energy (miscellaneous)
  • Mechanical Engineering
  • Fluid Flow and Transfer Processes
