Predictive machine learning algorithms for depression and anxiety disorders in six cancer types: a comprehensive multi-center population-based study.
Machine learning (ML) has advanced predictive modeling in medical diagnosis and risk assessment through large clinical datasets, yet applications for predicting post-cancer depression and anxiety remain limited. This cross-institutional, longitudinal study aims to develop ML models to predict depression and anxiety in cancer patients and to identify key contributing factors.
ML models were developed and validated to predict depression and anxiety disorders within one year of diagnosis for six cancer types, using data from three medical centers in Taiwan collected from January 2017 to December 2022 (n = 24,580). Study variables included demographic, clinical, and quality of care attributes. The ML algorithms employed were Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbor (KNN), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost). Model performance was evaluated using accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC). Statistical analysis was conducted with SPSS 26.0 and Spyder 3.8 (Python).
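The evaluation metrics named above can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the abstract does not say how the metrics were computed. The function names, the binary 0/1 label encoding, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = disorder, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case outscores a random negative one.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sketch but slow for large cohorts.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print(auroc(y_true, y_score))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUROC is computed from the ranking of the predicted scores and is threshold-free.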
The XGBoost model significantly outperformed the other models, achieving 98.30% accuracy, 98.17% precision, 98.30% F1 score, and 99.72% AUROC (P < 0.001). Feature importance analysis identified tumor size, age, and body mass index as the top risk factors for depression and anxiety disorders within one year of a cancer diagnosis.
ML algorithms advance the understanding of depression and anxiety in cancer patients by leveraging longitudinal data to identify key predictive factors. These models not only enhance mental health care and treatment quality but also provide insights that inform evidence-based guidelines, thereby improving outcomes for patients, families, and healthcare providers during cancer care.
Authors
Ling Ling, Wang Wang, Chung Chung, Tsai Tsai, Chen Chen, Chen Chen, Hsu Hsu, Shi Shi