Exploring predictive factors of physiological, biochemical indicators, and lifestyle for macrovascular complications in type 2 diabetes: a synthesis of machine learning models.
Traditional risk models for macrovascular complications in type 2 diabetes mellitus (T2DM) rely on physiological and biochemical indicators. These models often lack long-term follow-up data and may therefore overlook key variables.
A retrospective cohort study was conducted on 4,186 patients with T2DM from the Diabetes Health Management Platform in Hotan, Xinjiang, covering the period from 2015 to 2023. The data were randomly split 8:2 into training (n = 3,348) and validation (n = 838) sets, and eight machine learning (ML) algorithms were trained. Performance was evaluated using the area under the receiver operating characteristic curve (AUC), feature contributions were analyzed using SHAP values, and clinical applicability was assessed through decision curve analysis.
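The evaluation steps described above can be illustrated with a minimal, standard-library sketch. It is not the study's code: the function names, the random seed, and the toy labels and scores are assumptions made for illustration. It reproduces the 8:2 split sizes, computes AUC as the probability that a randomly chosen positive case outranks a randomly chosen negative one, and computes the net benefit used in decision curve analysis at a single threshold probability.

```python
import random


def split_80_20(n, seed=0):
    """Shuffle indices 0..n-1 and split them 8:2 (seed is an arbitrary assumption)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    return idx[:n_train], idx[n_train:]


def auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Split sizes for the cohort size reported in the abstract.
train_idx, val_idx = split_80_20(4186)
print(len(train_idx), len(val_idx))

# Toy labels and predicted risks (hypothetical, not study data).
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(auc(y, p))
print(net_benefit(y, p, threshold=0.5))
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares the model with the "treat all" strategy (every score set to 1.0) and the "treat none" strategy (net benefit 0).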
Compared with patients without macrovascular complications, the T2DM group with macrovascular complications had a significantly larger waist circumference and higher rates of oropharyngeal abnormalities and of absent lung crackles (P < 0.05). This group also had significantly higher BMI, body temperature, and ALT (P < 0.05) but lower fasting blood glucose, with borderline abnormalities in blood urea and AST. It showed higher frequencies of smoking, alcohol consumption, and exercise (P < 0.05), whereas self-reported "poor" health status showed the reverse trend (P < 0.05). Among all machine learning models (training AUC 0.68-0.85), XGBoost performed best (training AUC = 0.830, validation AUC = 0.850) and offered a greater clinical net benefit than traditional strategies. SHAP analysis identified BMI (contribution +0.1116), body temperature (+0.0923), and LDL-C (+0.0821) as key predictive factors, with elevated body temperature potentially indicating activation of subclinical inflammation.
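To make the idea of a per-feature SHAP contribution concrete, the following sketch computes exact Shapley values for one patient under a toy linear risk score. Everything in it is hypothetical: the weights, the patient values, and the baseline are invented for illustration and are unrelated to the contributions reported above. In practice, SHAP values for a tree model such as XGBoost are normally obtained from a dedicated library rather than by enumerating feature subsets as done here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for sample x relative to a baseline sample.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical linear score over (BMI, body temperature, LDL-C)."""
    bmi, temperature, ldl_c = z
    return 0.02 * bmi + 0.5 * temperature + 0.1 * ldl_c


patient = [30.0, 37.0, 4.0]    # hypothetical patient
baseline = [25.0, 36.5, 3.0]   # hypothetical reference values

phi = shapley_values(toy_risk, patient, baseline)
print(phi)  # one contribution per feature, in input order
```

For a linear score, each contribution equals the weight times the feature's deviation from the baseline, and the contributions sum to the difference between the patient's score and the baseline score.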
Among patients with macrovascular complications, the disconnect between behavioral health risks and subjective health perception is more pronounced. Elevated body temperature, together with high blood pressure, triglycerides, and fasting glucose, may indicate inflammation and increased cardiovascular risk, whereas moderate regular exercise appears protective.