Abstract
The role of hydrogen geo-storage and production in addressing global warming and energy demand concurrently cannot be overstated. Diverse factors such as interfacial tension (IFT) and wettability influence safe and effective hydrogen geo-storage and production. The IFT controls the maximum H2 storage column height, capacity, and capillary entry pressure. Current laboratory experimental techniques for IFT determination of H2/cushion gas systems are resource-intensive. Nonetheless, the extensive experimental IFT data available supports the deployment of machine learning (ML) to determine IFT time-efficiently and cost-effectively. Hence, this work evaluated the predictive capabilities of supervised ML paradigms, including random forest, extra trees regression, gradient boosting regression (GBR), and light gradient boosting machine, which constitutes the novelty of the study. A comprehensive dataset comprising 2564 IFT instances was gathered from the literature, encompassing the independent variables pressure (0.10-45 MPa), temperature (20-176 °C), brine salinity (0-20 mol/kg), and hydrogen, methane, carbon dioxide, and nitrogen mole fractions (0-100 mol.%). The data was pre-processed and split into 70% for model training and 30% for testing. Statistical metrics and visual representations were utilized for quantitative and qualitative assessments of the models. The Leverage approach was subsequently applied to classify the different data categories and verify the statistical validity of the database and the reliability of the constructed paradigms. The impact of the independent variables on IFT prediction was evaluated using Spearman correlation, permutation importance, and Shapley Additive Explanations (SHAP). Nitrogen and CO2 mole fractions demonstrated the least and greatest impact, respectively, on H2/cushion gas/brine IFT based on correlation analysis, permutation importance, and SHAP.
Generally, the developed paradigms successfully captured the underlying relationships between the independent variables and IFT, recording an overall R2 > 0.97, MAE < 1.30 mN/m, RMSE < 2 mN/m, and AARD < 2.3%. Nonetheless, the GBR model demonstrated superior predictive performance, yielding the highest R2 (0.987) and the lowest MAE, RMSE, and AARD (0.507 mN/m, 0.901 mN/m, and 0.906%, respectively). GBR also provided more accurate IFT results for pure H2/water and H2/cushion gas systems than empirical and molecular dynamics-based correlations developed by other scholars. Only 0.43-2.11% of the dataset was outside the validity range, underscoring the statistical validity of the database and the reliability of the models. The developed paradigms are beneficial tools in the toolbox of domain experts, which could fast-track workflows and minimize uncertainties surrounding conventional IFT determination techniques for aqueous H2 systems. This progress is promising for mitigating hydrogen loss and optimizing strategies in H2 geo-storage and production.
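The four statistical metrics quoted in the abstract (R2, MAE, RMSE, and AARD) can be illustrated with a minimal sketch. The abstract does not state the exact formulas the authors used, so the definitions below are the commonly used ones and should be read as an assumption; in particular, AARD is taken here as the mean of |predicted - measured| / |measured|, expressed in percent. The function name `regression_metrics` and the sample IFT values are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and AARD (%) for paired measured/predicted values.

    Assumed (textbook) definitions; the paper may define them differently.
    Measured values must be non-zero (for AARD) and not all identical (for R2).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "aard_pct": 100.0 / n * sum(abs(r / t) for r, t in zip(residuals, y_true)),
    }


# Hypothetical IFT values in mN/m, for illustration only (not from the paper).
measured = [70.0, 65.0, 60.0, 50.0]
predicted = [69.0, 66.0, 60.5, 49.0]
metrics = regression_metrics(measured, predicted)
```

MAE and RMSE carry the units of IFT (mN/m), whereas R2 and AARD are dimensionless, which is why the abstract reports the first two in mN/m and AARD as a percentage.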
Original language | English |
---|---|
Title of host publication | International Petroleum Technology Conference, IPTC 2025 |
Publisher | International Petroleum Technology Conference (IPTC) |
ISBN (Electronic) | 9781959025436 |
State | Published - 2025 |
Event | 2025 International Petroleum Technology Conference, IPTC 2025 - Kuala Lumpur, Malaysia; Duration: 18 Feb 2025 → 20 Feb 2025 |
Publication series
Name | International Petroleum Technology Conference, IPTC 2025 |
---|
Conference
Conference | 2025 International Petroleum Technology Conference, IPTC 2025 |
---|---|
Country/Territory | Malaysia |
City | Kuala Lumpur |
Period | 18/02/25 → 20/02/25 |
Bibliographical note
Publisher Copyright: Copyright 2025, International Petroleum Technology Conference.
Keywords
- Ensemble learning
- Hydrogen geo-storage
- Interfacial tension
- Sensitivity analysis
- Supervised machine learning
ASJC Scopus subject areas
- Geochemistry and Petrology
- Fuel Technology