Integrating Predictive Analytics and Explainable AI for Lung Cancer Detection within the One Health Framework
DOI:
https://doi.org/10.5281/zenodo.19053091
Keywords:
Sustainable agriculture, Crop yield prediction, Machine learning (ML), Deep learning, Explainable artificial intelligence (XAI), LIME (Local Interpretable Model-Agnostic Explanations), SHAP (Shapley Additive explanations), Model interpretability, Data quality, Model generalizability.
Abstract
Lung cancer is one of the leading causes of cancer death worldwide, mainly because it often goes undiagnosed until it is too late for effective treatment. Most patients show no signs or symptoms until the disease is well advanced, when treatment is likely to be less effective. When diagnosis comes late, the chance of survival decreases significantly, and treatment becomes more aggressive, costly, and uncertain. Even with new imaging technologies and improved diagnostic methods, early detection remains extremely difficult because many different genetic, environmental, and lifestyle factors contribute to lung cancer. For these reasons, this research takes a holistic One Health approach. The project studies how the interrelationships among health, environmental exposure, and lifestyle and social factors (e.g., smoking, air pollution, occupational exposure, socioeconomic status) contribute to the development and progression of lung cancer. By taking all of these elements into account, the study moves away from research that views lung cancer in a purely clinical context and toward an ecological, systems-oriented perspective. The main objective is to apply predictive analytics and machine learning (ML) techniques to health data to determine the factors associated with an increased risk of developing lung cancer. ML and predictive analytics can uncover hidden patterns and non-linear relationships in large, complex data sets. Several ML algorithms will be used to predict lung cancer, including classification models such as Logistic Regression, Random Forest, and Gradient Boosting Machines.
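To make the risk-factor modeling idea concrete, the following is a minimal sketch of one of the named classifiers, logistic regression, fitted by plain gradient descent so that it runs without external libraries. It is not the authors' pipeline. The cohort is synthetic, and the feature names, simulated effect sizes, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical binary risk factors; names are illustrative, not from the study data.
FEATURES = ["smoking", "air_pollution", "occupational_exposure"]
# Assumed effect sizes, used only to simulate labels for this sketch.
TRUE_WEIGHTS = [2.0, 1.0, 0.5]
TRUE_BIAS = -2.5


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_cohort(n, rng):
    """Simulate n records of binary exposures with a binary outcome label."""
    rows, labels = [], []
    for _ in range(n):
        x = [1.0 if rng.random() < 0.4 else 0.0 for _ in FEATURES]
        p = sigmoid(TRUE_BIAS + sum(w * v for w, v in zip(TRUE_WEIGHTS, x)))
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def fit_logistic_regression(rows, labels, lr=0.5, epochs=400):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(weights, bias, x):
    """Return the predicted probability of the positive class for one record."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows, labels = make_synthetic_cohort(2000, rng)
weights, bias = fit_logistic_regression(rows, labels)

# A positive weight means the factor is associated with higher predicted risk;
# exp(weight) is the corresponding odds ratio.
for name, w in zip(FEATURES, weights):
    print(f"{name}: weight={w:.2f}, odds ratio={math.exp(w):.2f}")
```

Logistic regression is shown because its coefficients can be read directly as odds ratios per risk factor. Tree ensembles such as Random Forest or Gradient Boosting Machines can capture non-linear relationships but typically need a post-hoc explanation method to give a comparable per-factor reading.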
License
Copyright (c) 2026 B. Pavani, K. Tejasree, K. Siva Karthik, T. Kavyanjali

This work is licensed under a Creative Commons Attribution 4.0 International License.