Interpretable LightGBM model with SHAP analysis predicts non-excellent response to initial radioiodine therapy in differentiated thyroid carcinoma.
To identify independent determinants influencing therapeutic outcomes of initial radioactive iodine (¹³¹I) therapy in differentiated thyroid carcinoma (DTC) and establish an interpretable predictive framework for clinical decision-making.
A retrospective cohort of 950 treatment-naïve DTC patients undergoing initial ¹³¹I therapy was randomly allocated into training (n = 664) and testing (n = 286) cohorts. Multivariable logistic regression (LR) analysis identified variables associated with treatment response. Seven machine learning (ML) models were then developed: decision tree (DT), LR, random forest (RF), support vector machine (SVM), adaptive boosting (AdaBoost), eXtreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). Model performance was evaluated with receiver operating characteristic (ROC) curves and the area under the curve (AUC), calibration plots, and decision curve analysis (DCA). The SHapley Additive exPlanations (SHAP) framework was used to interpret the best-performing model.
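The AUC used above to compare models can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. The following is a minimal sketch of that definition, not the study's code; the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    y_true:  iterable of 0/1 labels (1 = outcome present; hypothetical coding).
    y_score: iterable of predicted scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based formula gives the same value more efficiently.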
Multivariable analysis identified age, ¹³¹I therapeutic interval (TI), tumor multiplicity, maximum tumor diameter (MTD), lymph node metastasis (LNM) topography, stimulated thyroglobulin (sTg), thyroglobulin antibody (TgAb) levels, administered activity (AA), and post-therapy whole-body scan (Rx-WBS) findings as independent predictors of non-excellent response (N-ER). The LightGBM model achieved superior discrimination (AUC = 0.896) in the testing cohort, outperforming conventional LR (AUC = 0.842). SHAP interpretation identified sTg (mean absolute SHAP value = 1.285), TgAb (mean absolute SHAP value = 0.642), and LNM topography (mean absolute SHAP value = 0.308) as the principal predictors. DCA showed that the model yielded a higher net benefit than the "treat-all" and "treat-none" strategies at multiple threshold probabilities, indicating potential value in clinical decision-making.
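The decision curve comparison reported above rests on the standard net benefit formula, NB = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability; treating no one has a net benefit of zero by definition. The following is a minimal sketch of that calculation, not the study's implementation; the function names and the toy data are illustrative assumptions.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that acts on every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy labels (1 = outcome present) and predicted risks, invented for illustration.
labels = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
risks = [0.9, 0.8, 0.3, 0.2, 0.7, 0.6, 0.1, 0.2, 0.4, 0.1]

nb_model = net_benefit(labels, risks, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
nb_none = 0.0  # acting on no one yields zero net benefit by definition
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the three strategies together; the model is useful at a given threshold when its net benefit exceeds both reference values.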
The developed LightGBM model accurately predicts the response to initial ¹³¹I therapy in DTC patients, and SHAP-based explanation of each clinical risk factor's contribution can support individualized treatment planning. Integration of multicenter prospective data and molecular biomarkers is needed to improve the model's generalizability.