Trustworthy assessment of 2D model for lung CT scans.
Immunotherapy (IO) has revolutionized the treatment of non-small cell lung cancer (NSCLC). However, identifying the best candidates for this therapy remains a challenge. Biomarkers such as PD-L1 are currently used in clinical practice to guide treatment decisions, but their predictive power is limited. There is therefore an urgent need for new models that more accurately identify which patients will benefit most from immunotherapy. In this context, Artificial Intelligence (AI) has shown promising results in deriving novel data-driven biomarkers from medical imaging, offering a way to enhance patient selection and treatment stratification. These AI-driven biomarkers have not yet been widely adopted in clinical practice because of various concerns, including the time required for radiologists to perform segmentation. In this study, a 2D ResNet50 model, pre-trained on 1.35 million radiologic images, was employed to process 2D lung CT scans from patients with advanced NSCLC treated at Fondazione IRCCS Istituto Nazionale dei Tumori in Milan. The model was designed to predict poor responders, defined as patients with an overall survival of less than six months, using radiological images acquired before the initiation of IO treatment (baseline CT scans). Our model achieved an F1-score of 0.74 on the test set. To assess the model's robustness, a fairness evaluation was conducted across demographic subgroups, specifically sex and age, and a two-sample independent t-test was performed to assess statistical differences between these groups. Our analysis highlights fairness concerns in the model predictions, with significant p-values (p < 0.05), suggesting that sex and age may be confounding factors for the model prediction.
Further investigations are required to mitigate these biases and ensure equitable model performance across diverse patient populations.
Clinical relevance - The model identifies poor responders (patients with an overall survival of less than 6 months), potentially preventing unnecessary IO administration in NSCLC patients unlikely to benefit from the therapy. Additionally, it evaluates how variations in data distribution could affect model performance.
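The evaluation described in the abstract (an F1-score plus a subgroup comparison with a two-sample independent t-test) can be illustrated with a minimal sketch. The abstract does not state which quantity the t-test was applied to, so this example assumes it compares the model's predicted probabilities between two subgroups. All data below are synthetic, and the names `f1_score`, `f1_by_group`, and `student_t` are hypothetical helpers, not the authors' code.

```python
from math import sqrt
from statistics import mean, variance


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = poor responder, OS < 6 months)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_group(y_true, y_pred, groups):
    """Compute F1 separately within each demographic subgroup."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        result[g] = f1_score([y_true[i] for i in idx],
                             [y_pred[i] for i in idx])
    return result


def student_t(a, b):
    """Two-sample independent t statistic (pooled variance) and its
    degrees of freedom. Assumption: equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Synthetic toy data, for illustration only (not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]  # predicted probabilities
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

overall_f1 = f1_score(y_true, y_pred)
group_f1 = f1_by_group(y_true, y_pred, sex)
t_stat, dof = student_t(
    [s for s, g in zip(scores, sex) if g == "F"],
    [s for s, g in zip(scores, sex) if g == "M"],
)
print(overall_f1, group_f1, t_stat, dof)
```

The sketch stops at the t statistic and degrees of freedom to stay within the standard library. If SciPy is available, `scipy.stats.ttest_ind(a, b)` performs the same pooled-variance test and also returns the p-value that would be compared against the 0.05 threshold.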
Authors
Favali, Miskovic, Eusebio, Trovo, Provenzano, Guirges, Ruggirello, Garassino, Proto, Russo, Prelaj, Pedrocchi