Domain-Specific Data Augmentation for Lung Nodule Malignancy Classification.

Lung cancer is one of the leading causes of cancer-related deaths worldwide, mainly due to late diagnosis. Screening programs can benefit from Computer-Aided Diagnosis (CAD) systems that detect and classify lung nodules using Computed Tomography (CT) scans. A large proportion of the literature proposes deep learning models based on single, private datasets, with no evaluation of their generalisation capability. The main goal of this work is to study and address the lack of generalisation to out-of-domain data (source domain different from the target domain). First, we propose using a ResNet architecture with 2.5D inputs capable of maintaining the spatial information of the nodules (3 input channels based on the anatomical planes). Second, we apply domain-specific data augmentation tailored for CT scans. Combined with data augmentation, using 2.5D inputs achieves the best results, both in in-domain data (LIDC-IDRI: N=1377 nodules; and LNDb: N=183 nodules) and in out-of-domain data (LUNGx: N=73 nodules). In in-domain data, an Area Under the Curve (AUC) of 0.914 was achieved in the internal test set and 0.746 in one of the external test sets. Notably, in out-of-domain data, the ground-truth labels were confirmed by biopsy, whereas the training data only involved radiologist annotations of the "likelihood of malignancy"; there, the AUC improves from 0.576 to 0.695, reaching a performance close to that of radiology experts. In the future, strategies should be applied to deal with the level of uncertainty of lung nodule annotations based solely on the observation of the CT scans.

Clinical relevance: This work provides an automatic method for lung nodule malignancy classification based on CT scans, combined with generalisation methods that allow good performance across different cohort populations and hospitals.
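As an illustration of the 2.5D input described in the abstract, the following is a minimal sketch of how three orthogonal slices through a nodule centre could be stacked into a 3-channel image. It is not the authors' implementation: the function name `extract_25d_patch`, the `(z, y, x)` axis order, the patch size, the edge padding, and the assumption of isotropic voxel spacing are all hypothetical choices made for this example.

```python
import numpy as np


def extract_25d_patch(volume, center, half_size=16):
    """Stack axial, coronal and sagittal slices through `center` as 3 channels.

    Assumptions (hypothetical, not from the paper): `volume` is a 3D array
    indexed as (z, y, x) with isotropic voxels, and `center` is the nodule
    centre in voxel coordinates.
    """
    h = half_size
    # Edge-pad so that patches near the volume border keep the same size.
    padded = np.pad(volume, h, mode="edge")
    z, y, x = (c + h for c in center)

    axial = padded[z, y - h:y + h, x - h:x + h]      # plane of constant z
    coronal = padded[z - h:z + h, y, x - h:x + h]    # plane of constant y
    sagittal = padded[z - h:z + h, y - h:y + h, x]   # plane of constant x

    # Result has shape (3, 2 * half_size, 2 * half_size), channels first.
    return np.stack([axial, coronal, sagittal], axis=0).astype(np.float32)
```

The resulting array has the same layout as a 3-channel image, so it could be passed to a standard 2D ResNet without changing the first convolution; whether the authors did this is not stated in the abstract.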
Cancer
Chronic respiratory disease
Care/Management

Authors

Gouveia, Araujo, Oliveira, Pereira