Abstract
Clinical predictive models play an important role in healthcare. An important task in lung cancer care is to identify, from a selected population, the participants in a screening program who are at higher risk of lung cancer. Electronic Healthcare Record (EHR) data can be acquired from primary care and have been used to emulate a screening program. One such EHR dataset is the Clinical Practice Research Datalink (CPRD), which covers 4.5% of the UK population. In this paper, we provide a worked example of this task, employing the Explainable Boosting Machine (EBM) as the predictive model and the CPRD dataset as the EHR source. The EBM is a prominent example of an inherently interpretable model (IIM). IIMs produce predictions of the target variable and explanations of the model simultaneously. More importantly, EBMs represent a family of non-linear IIMs, a generalisation that significantly extends logistic regression. EBMs have been developed as an end-to-end system at Microsoft Research, which provides powerful visualisation tools for evaluating both model predictions and explanations. EBM users may nevertheless want to know more technical details about the EBM itself. We therefore provide a brief introduction to Generalised Additive Models, Gradient Boosting, Boosted Trees, and Bagging Ensembles. Finally, we present two EBM-based use cases in the healthcare domain, as well as an illustrative example of lung cancer prediction and explanation.
| Original language | English |
|---|---|
| Title of host publication | Applied Intelligence and Informatics - 4th International Conference, AII 2024, Revised Selected Papers |
| Editors | Mufti Mahmud, M. Shamim Kaiser, Joarder Kamruzzaman, Khan Iftekharuddin, Md Atiqur Rahman Ahad, Ning Zhong |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 184-199 |
| Number of pages | 16 |
| ISBN (Print) | 9783032046567 |
| State | Published - 2025 |
| Event | 4th International Conference on Applied Intelligence and Informatics, AII 2024 - London, United Kingdom. Duration: 18 Dec 2024 → 20 Dec 2024 |
Publication series
| Name | Communications in Computer and Information Science |
|---|---|
| Volume | 2607 CCIS |
| ISSN (Print) | 1865-0929 |
| ISSN (Electronic) | 1865-0937 |
Conference
| Conference | 4th International Conference on Applied Intelligence and Informatics, AII 2024 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 18/12/24 → 20/12/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Keywords
- Collinearity
- Feature Importance Measures
- Feature Selection
- Greedy Approach to Boosting
- Ischemic Heart Disease
- Lung Cancer
- Missing Values
- Rectal Cancer
- Round-Robin Cycle
ASJC Scopus subject areas
- General Computer Science
- General Mathematics