Abstract
Lung cancer is one of the deadliest cancers in the world, and its mortality rate is high when it is diagnosed late. Early detection is therefore crucial for improving the survival rate, and lung cancer screening is one of the most important intervention tools. However, screening is cost-effective only when we can accurately select the sub-population at highest risk of lung cancer. It is hypothesised that this selection task can be done cost-effectively using clinical data (e.g. demographic, lifestyle and comorbidity variables) rather than low-dose CT. This work uses clinical data extracted from the Clinical Practice Research Datalink (CPRD). The goal is to test whether this approach can achieve comparable or even better selection performance than an alternative approach, adopted in [54], that uses clinical data from lung cancer screening trials. In this paper, we further adapt the logistic regression model for joint classification and feature selection. The model is implemented in an ‘ensemble learning’ manner to deal with the severe ‘class imbalance’ problem. The resulting sensitivity and specificity are slightly better than those reported in [54]. We also identify a comorbidity factor, COPD, and a smoking-related factor, smk-status, as the two most discriminative features.
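
The abstract does not say how the ensemble or the feature selection is built, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes (a) that class imbalance is handled by training each ensemble member on a balanced subset (all minority cases plus an equally sized random draw of majority cases) and (b) that feature selection comes from an L1 penalty that drives uninformative weights to zero. All function names and parameters (`fit_l1_logistic`, `fit_balanced_ensemble`, `lam`, `lr`, `epochs`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit L1-penalised logistic regression by proximal gradient descent.

    ASSUMPTION: the L1 penalty stands in for the paper's (unspecified)
    feature selection mechanism. Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient step on the log-loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n  # the bias is not penalised
    return w, b


def fit_balanced_ensemble(X, y, n_models=10, seed=0, **fit_kwargs):
    """Train one model per balanced subset of the data.

    ASSUMPTION: random undersampling of the majority class is one common
    way to build such an ensemble; the paper may use a different scheme.
    """
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    models = []
    for _ in range(n_models):
        idx = minority + rng.sample(majority, len(minority))
        models.append(fit_l1_logistic([X[i] for i in idx],
                                      [y[i] for i in idx], **fit_kwargs))
    return models


def predict_proba(models, x):
    # Average the predicted probabilities over all ensemble members.
    return sum(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
               for w, b in models) / len(models)


def selection_frequency(models):
    # Fraction of ensemble members that keep each feature (non-zero weight).
    d = len(models[0][0])
    return [sum(1 for w, _ in models if w[j] != 0.0) / len(models)
            for j in range(d)]
```

Under these assumptions, features with a high `selection_frequency` across members would be the candidates for "most discriminative", and thresholding `predict_proba` gives the classification from which sensitivity and specificity can be computed.
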
| Original language | English |
|---|---|
| Title of host publication | Proceedings of Trends in Electronics and Health Informatics - TEHI 2022 |
| Editors | Mufti Mahmud, Claudia Mendoza-Barrera, M. Shamim Kaiser, Anirban Bandyopadhyay, Kanad Ray, Eduardo Lugo |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 191-206 |
| Number of pages | 16 |
| ISBN (Print) | 9789819919154 |
| State | Published - 2023 |
| Externally published | Yes |
| Event | 2nd International Conference on Trends in Electronics and Health Informatics, TEHI 2022 - Puebla, Mexico; Duration: 7 Dec 2022 → 9 Dec 2022 |
Publication series
| Name | Lecture Notes in Networks and Systems |
|---|---|
| Volume | 675 LNNS |
| ISSN (Print) | 2367-3370 |
| ISSN (Electronic) | 2367-3389 |
Conference
| Conference | 2nd International Conference on Trends in Electronics and Health Informatics, TEHI 2022 |
|---|---|
| Country/Territory | Mexico |
| City | Puebla |
| Period | 7/12/22 → 9/12/22 |
Bibliographical note
Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Keywords
- CPRD
- Cancer screening
- Classification
- Cost-effectiveness
- Early detection
- Feature selection
- Imbalanced classification
- Logistic regression
- Lung cancer
ASJC Scopus subject areas
- Control and Systems Engineering
- Signal Processing
- Computer Networks and Communications