Application of machine learning models to predict cytotoxicity of ionic liquids using VolSurf principal properties

Grace Amabel Tabaaza, Bennet Nii Tackie-Otoo, Dzulkarnain B. Zaini, Daniel Asante Otchere, Bhajan Lal*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

Ionic liquids (ILs) are considered greener alternatives to traditional organic solvents because of their unique physical and chemical properties. Nevertheless, recent studies have shown that ILs can induce toxic effects in ecosystems. It is therefore essential to determine the level of risk they pose to aquatic life before these ILs can be used successfully. Measuring the toxicity of various ILs across a broad spectrum of conditions through experimental techniques is demanding in time and resources, and is at times impractical. Various quantitative structure-activity/property relationship (QSAR/QSPR) studies have been performed to predict IL toxicity expressed as EC50. In this study, five supervised machine learning models were trained and tested using nine Principal Properties (PPs) as descriptors to predict leukemia rat cell line (IPC-81) cytotoxicity. Eight feature selection techniques were then used to preprocess the data in order to improve the performance of the best of the preliminarily trained models. Analysis of model performance on the out-of-sample data set showed that the Extreme Gradient Boosting (XGBoost) model predicted best, with the highest test score (R² = 0.79). This model was the most parsimonious (minimum AIC of 46.50), consistent (minimum RMSE of 0.45), and precise (minimum MAE of 0.32) in predicting IPC-81 cytotoxicity. The feature importance attribute of XGBoost confirmed that structural features of the IL cation, such as cationic hydrophilicity and side-chain length, have a significant impact on toxicity. Nevertheless, the anionic part of an IL also contributes to its toxicity and needs to be considered in toxicity prediction. Among the tested feature selection techniques, the random forest technique was the best at improving model performance (i.e., it gave the lowest error metrics: AIC = 41.22, MAE = 0.31, and RMSE = 0.4259), but at a longer execution time. However, the wrapper methods were the most robust in improving computational efficiency (i.e., they improved model performance at the shortest execution time). This study therefore advances QSPR studies on toxicity prediction for new ILs through the application of machine learning and feature selection techniques.
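The abstract reports four performance measures (R², RMSE, MAE, and AIC) without stating how they were computed. The following is a minimal sketch of how such metrics might be calculated from paired observed and predicted values. The function name `regression_metrics`, the parameter `n_params`, and the toy numbers are hypothetical, and the AIC form used here (the common least-squares expression n·ln(RSS/n) + 2k) is an assumption, not necessarily the formula used in the study.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return R^2, RMSE, MAE, and a least-squares AIC for paired values.

    n_params is the number of fitted parameters (k) used in the AIC penalty.
    The AIC form below is an assumed convention, not taken from the paper.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)               # residual sum of squares
    mean_true = sum(y_true) / n
    tss = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares

    # R^2 is undefined when the observed values have no variance.
    r2 = 1.0 - rss / tss if tss > 0 else float("nan")
    rmse = math.sqrt(rss / n)
    mae = sum(abs(r) for r in residuals) / n
    # Assumed AIC: n * ln(RSS / n) + 2k; a perfect fit gives -infinity.
    aic = n * math.log(rss / n) + 2 * n_params if rss > 0 else float("-inf")

    return {"r2": r2, "rmse": rmse, "mae": mae, "aic": aic}


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    print(regression_metrics(observed, predicted, n_params=2))
```

Under this convention, lower AIC, RMSE, and MAE and higher R² indicate a better model, which matches how the abstract ranks the models and feature selection techniques.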

Original language: English
Article number: 100266
Journal: Computational Toxicology
Volume: 26
State: Published - May 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023

Keywords

  • Cytotoxicity
  • Ionic liquids
  • Leukemia rat cell line (IPC-81)
  • Machine learning
  • QSPR/QSAR and principal properties

ASJC Scopus subject areas

  • Toxicology
  • Computer Science Applications
  • Health, Toxicology and Mutagenesis
