Machine Learning-Based Multidimensional Oximetry for Obstructive Sleep Apnea Screening: Development and External Validation.
Obstructive sleep apnea (OSA) affects nearly one billion people globally and poses a substantial public health threat. Effective and accessible methods for OSA risk identification are urgently needed.
This study aims to develop and externally validate a machine learning model for OSA screening based on multiple peripheral oxygen saturation (SpO2) parameters measured by pulse oximetry, and to evaluate its performance, interpretability, and robustness across sex and age subgroups.
Of 4156 screened participants, 2195 underwent polysomnography (internal cohort) and 446 received home sleep apnea testing (external cohort). Eight SpO2-derived parameters, including oxygen desaturation index (ODI), hypoxic burden (HB), and ST90 (percentage of sleep time with SpO2 < 90%), were used to construct models. Six machine learning algorithms were trained, with F1-score as the primary metric and area under the curve as the secondary metric. Model interpretability was assessed using Shapley additive explanations and intrinsic feature importance scores.
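As a minimal illustration of the quantities named above, the following Python sketch computes ST90, MinSpO2, and MeanSpO2 from an SpO2 trace and an F1-score from confusion-matrix counts. It is not the study's pipeline: the function names, the toy trace, and the counts are hypothetical, and the sketch assumes uniformly sampled SpO2 values recorded only during sleep, so that the fraction of samples below 90% approximates the fraction of sleep time. ODI and HB are omitted because their event definitions are not given here.

```python
from statistics import mean


def oximetry_summary(spo2, threshold=90.0):
    """Summarise an SpO2 trace (percent values).

    Assumption: samples are uniformly spaced and all fall within sleep,
    so the share of samples below the threshold stands in for time share.
    """
    if not spo2:
        raise ValueError("SpO2 trace is empty")
    below = sum(1 for value in spo2 if value < threshold)
    return {
        "ST90": 100.0 * below / len(spo2),  # percent of samples below 90%
        "MinSpO2": min(spo2),
        "MeanSpO2": mean(spo2),
    }


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy data, for illustration only.
trace = [97, 96, 95, 89, 88, 91, 94, 96, 97, 98]
summary = oximetry_summary(trace)
print(summary)
print(f1_score(tp=80, fp=10, fn=10))
```

In practice the trace would first need artifact removal and sleep-period selection, both of which this sketch leaves out.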
Oximetry indices showed nonlinear relationships with OSA probability. The 4-parameter model combining ODI, HB, minimum SpO2 (MinSpO2), and ST90 achieved the best performance (F1-score = 0.9516, area under the curve = 0.9879), surpassing all single-parameter models. Shapley additive explanations analysis identified ODI, HB, and MinSpO2 as key predictors. The ODI-HB-MinSpO2-MeanSpO2 configuration, which uses mean SpO2 (MeanSpO2) in place of ST90, performed best in female and younger subgroups, whereas the ODI-HB-MinSpO2-ST90 model remained the best for male and older participants. Categorical boosting outperformed other algorithms across multiple metrics and remained robust in both subgroup and external validation analyses.
The multi-parameter oximetry model based on the categorical boosting algorithm provides a simple and accurate tool for OSA screening. Sex- and age-stratified strategies can further enhance its clinical applicability.