Machine learning-enhanced plasma proteomics discriminates pancreatic cancer-associated diabetes from type 2 diabetes mellitus.
Pancreatic ductal adenocarcinoma (PDAC) is frequently preceded by new-onset diabetes mellitus (NODM), yet differentiating PDAC-associated DM from type 2 diabetes (T2D) remains clinically challenging. We investigated whether plasma proteomic profiling combined with machine learning could discriminate these conditions. Plasma samples from individuals with PDAC (with and without DM), long-standing T2D, and controls were analyzed by MALDI-TOF mass spectrometry. Spectral features were processed through a nested cross-validation framework to prevent data leakage, and model interpretability was explored using SHAP values. In parallel, low-molecular-weight proteins were characterized by GeLC-MS followed by LC-MS/MS and differential abundance analysis. Machine learning models distinguished PDAC-associated DM from T2D with a balanced accuracy of 85%. Proteomic analyses identified distinct signatures in PDAC-associated DM, including downregulation of erythrocyte-related proteins and PPBP, and upregulation of acute-phase reactants such as FGA, CP, and SERPINA3. Treatment-naïve cases displayed increased circulating epithelial and keratin-associated proteins, which were attenuated after therapy, suggesting dynamic tumor-related remodeling. These findings demonstrate that integrating MALDI-TOF profiling with machine learning can capture plasma signatures associated with PDAC-associated DM. Although exploratory, this approach supports further validation in prospective cohorts aimed at improving PDAC risk stratification among individuals with NODM.
SIGNIFICANCE: Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal malignancy with a dismal 5-year survival rate, primarily due to late-stage diagnosis. The frequent occurrence of new-onset diabetes mellitus (NODM) as a paraneoplastic syndrome offers a critical window for early detection. However, the difficulty of distinguishing PDAC-associated diabetes (PDAC-DM) from type 2 diabetes mellitus (T2D) has hindered the implementation of effective screening strategies. This study addresses that problem with a multi-faceted proteomics approach. We demonstrate that integrating MALDI-TOF mass spectrometry peptide profiling with machine learning algorithms can discriminate PDAC-DM from T2D with a balanced accuracy of 85%. Furthermore, we used LC-MS/MS to identify specific low-molecular-weight proteins that are differentially regulated between these conditions, providing a molecular basis for the observed discrimination. Our work presents a novel, high-throughput pipeline for biomarker discovery that combines the scalability of MALDI-TOF with the analytical power of LC-MS/MS and machine learning. The identified plasma signatures hold translational potential to improve risk stratification in patients with new-onset diabetes, which could ultimately enable earlier diagnosis of PDAC and improve patient survival prospects. This research contributes to the field of clinical proteomics by providing a methodological framework and candidate biomarkers for the early detection of one of oncology's most challenging diseases.
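The abstract names nested cross-validation and balanced accuracy without detailing them. The sketch below illustrates the general idea only and is not the authors' pipeline: the data are synthetic, the classifier is a simple nearest-centroid rule swapped in for illustration, and every name (`make_data`, `fit_centroids`, `nested_cv`, the feature-count grid) is hypothetical. The point it shows is that feature ranking and hyperparameter choice happen inside each outer training fold, so the held-out outer fold never influences the model that is scored on it.

```python
import random
from statistics import mean

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; a majority-class guesser scores 0.5 on two classes."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, label in enumerate(y_true) if label == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return mean(recalls)

def stratified_folds(y, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def fit_centroids(X, y, n_features):
    """Rank features on the given (training) data only, then keep class centroids."""
    classes = sorted(set(y))
    n_cols = len(X[0])
    cent = {c: [mean(X[i][j] for i in range(len(X)) if y[i] == c)
                for j in range(n_cols)] for c in classes}
    # Rank features by the absolute difference between the two class means.
    ranked = sorted(range(n_cols),
                    key=lambda j: -abs(cent[classes[0]][j] - cent[classes[1]][j]))
    return ranked[:n_features], cent

def predict(model, X):
    """Assign each row to the nearest class centroid over the kept features."""
    keep, cent = model
    def dist(row, c):
        return sum((row[j] - cent[c][j]) ** 2 for j in keep)
    return [min(cent, key=lambda c: dist(row, c)) for row in X]

def nested_cv(X, y, grid=(1, 5, 20), outer_k=5, inner_k=3, seed=0):
    """Outer loop estimates performance; inner loop picks n_features."""
    rng = random.Random(seed)
    outer = stratified_folds(y, outer_k, rng)
    outer_scores = []
    for test_idx in outer:
        train_idx = [i for fold in outer if fold is not test_idx for i in fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        inner = stratified_folds(y_tr, inner_k, rng)

        def inner_score(n):
            scores = []
            for val in inner:
                fit = [i for fold in inner if fold is not val for i in fold]
                model = fit_centroids([X_tr[i] for i in fit], [y_tr[i] for i in fit], n)
                preds = predict(model, [X_tr[i] for i in val])
                scores.append(balanced_accuracy([y_tr[i] for i in val], preds))
            return mean(scores)

        best_n = max(grid, key=inner_score)          # tuned on training data only
        model = fit_centroids(X_tr, y_tr, best_n)    # refit on the full outer-train set
        preds = predict(model, [X[i] for i in test_idx])
        outer_scores.append(balanced_accuracy([y[i] for i in test_idx], preds))
    return outer_scores

def make_data(n_per_class=30, n_features=30, seed=1):
    """Synthetic stand-in for spectral features: 3 informative columns, rest noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_features)]
            for j in range(3):
                row[j] += 1.5 * label
            X.append(row)
            y.append(label)
    return X, y

X, y = make_data()
scores = nested_cv(X, y)
print("balanced accuracy per outer fold:", [round(s, 2) for s in scores])
print("mean balanced accuracy:", round(mean(scores), 2))
```

Ranking features on the full dataset before splitting would leak information from the test folds into the model and tend to inflate the estimate; keeping that step inside `fit_centroids`, which only ever sees training rows, is what avoids this.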
Authors
Lazari, Donnarumma, Matheus, D'Alpino Peixoto, de Matos, Valerio, Rosa-Fernandes, Oba-Shinjo, Machado, Machado, Marie, Correa-Giannella, Palmisano