Integrating Predictive Analytics and Explainable AI for Lung Cancer Detection within the One Health Framework

Authors

  • B. Pavani, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
  • K. Tejasree, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
  • K. Siva Karthik, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
  • T. Kavyanjali, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India

DOI:

https://doi.org/10.5281/zenodo.19053091

Keywords:

Sustainable agriculture, Crop yield prediction, Machine learning (ML), Deep learning, Explainable artificial intelligence (XAI), LIME (Local Interpretable Model-Agnostic Explanations), SHAP (SHapley Additive exPlanations), Model interpretability, Data quality, Model generalizability.

Abstract

Lung cancer is one of the leading causes of cancer death worldwide, mainly because it is often diagnosed too late for effective treatment. Most patients show no signs or symptoms until the disease is well advanced, when treatment is likely to be less effective. Once the cancer is diagnosed at a late stage, the chance of survival decreases significantly, and treatment becomes more aggressive, costly, and uncertain. Even with new imaging technologies and improved diagnostic methods, early detection remains extremely difficult because many genetic, environmental, and lifestyle factors contribute to lung cancer. For these reasons, this research takes a holistic One Health approach. The project studies how the interrelationships among health, environmental exposure, and lifestyle and social factors (e.g., smoking, air pollution, occupational exposure, socioeconomic status) contribute to the development and progression of lung cancer. By taking all of these elements into account, the study moves away from research that views lung cancer only in a clinical context and toward an ecological, systems-oriented perspective. The main objective is to use predictive analytics and machine learning (ML) techniques to analyze health data and identify the factors associated with an increased risk of developing lung cancer. ML and predictive analytics can uncover hidden patterns and non-linear relationships in large, complex data sets. Several ML classification models will be used to predict lung cancer, including Logistic Regression, Random Forest, and Gradient Boosting Machines.
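
To make the modelling step concrete, the following is a minimal sketch of the simplest of the classifiers named above, Logistic Regression, fitted by gradient descent using only the Python standard library. It is not the authors' pipeline: the feature names, the data generator, and the assumed log-odds are all hypothetical, and the data are synthetic. Random Forest and Gradient Boosting Machines would normally come from an ML library and are not shown here.

```python
import math
import random

# Hypothetical binary risk factors; the names are illustrative assumptions.
FEATURES = ["smoker", "high_air_pollution", "occupational_exposure"]
# Assumed log-odds used ONLY to generate synthetic labels (not real estimates).
TRUE_WEIGHTS = [2.0, 1.0, 0.5]
TRUE_BIAS = -2.5


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_data(n, seed=0):
    """Draw n synthetic records: binary exposures and a binary outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [1.0 if rng.random() < 0.4 else 0.0 for _ in FEATURES]
        p = sigmoid(TRUE_BIAS + sum(w * x for w, x in zip(TRUE_WEIGHTS, row)))
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic_regression(X, y, lr=0.5, epochs=400):
    """Full-batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, row):
    """Predicted probability of the positive class for one record."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, row)))


X, y = make_synthetic_data(1500)
weights, bias = fit_logistic_regression(X, y)
accuracy = sum(
    (predict_proba(weights, bias, r) >= 0.5) == bool(t) for r, t in zip(X, y)
) / len(y)

for name, w in zip(FEATURES, weights):
    print(f"{name}: log-odds weight {w:+.2f}")
print(f"training accuracy: {accuracy:.2f}")
```

Each fitted weight is the change in log-odds of the outcome associated with that factor, which is one reason logistic regression is a common interpretable baseline before tree ensembles are explained with tools such as SHAP or LIME.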

Published

2026-03-08