• Navigating Severe Class Imbalance in Population Cohort Data.
    3 weeks ago
    Class imbalance is a major challenge in predictive modelling of rare disease outcomes, particularly in large-scale population cohorts. Traditional machine learning models often struggle with imbalanced datasets, leading to biased performance metrics and poor generalisability. This study systematically evaluates multiple approaches to mitigating class imbalance when predicting multiple myeloma from proteomic and clinical data in UK Biobank. We compare standard classification models (XGBoost and logistic regression) with synthetic resampling (SMOTE), anomaly detection techniques (isolation forests, local outlier factors, one-class SVMs, and autoencoders), and a transformer-based foundation model (TabPFN), using standard classification performance metrics. Our results indicate that anomaly detection models generalise better than conventional classifiers (XGBoost and logistic regression), while SMOTE fails to improve, and may actively worsen, predictive performance. To address the precision-sensitivity trade-off, we introduce a sequential XGBoost ensemble classifier (SeqXGB) that prioritises high precision over sensitivity to minimise false positive predictions (a schematic sketch of this cascade idea follows this entry). Compared with a single XGBoost model, the SeqXGB approach markedly reduces false positives (from 420 to 9) but substantially limits sensitivity (from 0.70 to 0.15) in held-out test data. Our findings highlight that no single method is universally optimal for addressing class imbalance; rather, model selection should be guided by the clinical application, balancing the risks of false positives and false negatives. Clinical relevance - This study highlights the challenges of using machine learning to predict diseases in highly imbalanced large population cohorts and underscores the need to consider clinical purpose (e.g. screening vs diagnosis) when evaluating models.
    Cancer
    Cardiovascular diseases
    Access
    Care/Management
    Advocacy
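    The abstract does not specify how SeqXGB chains its models, so the following is only a hypothetical illustration of a precision-first, two-stage XGBoost cascade on synthetic imbalanced data: a second model is retrained on the cases the first flags, and a positive prediction requires both stages to agree at a strict threshold. The toy data, thresholds, and cascade design are assumptions, not the authors' implementation.

```python
# Hypothetical precision-first XGBoost cascade; NOT the authors' SeqXGB.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score
from xgboost import XGBClassifier

# Heavily imbalanced toy data standing in for the cohort features.
X, y = make_classification(n_samples=20_000, n_features=30, weights=[0.995],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1: high-recall screener trained on all data.
stage1 = XGBClassifier(scale_pos_weight=(y_tr == 0).sum() / (y_tr == 1).sum(),
                       n_estimators=200, eval_metric="logloss")
stage1.fit(X_tr, y_tr)

# Stage 2: refit only on samples stage 1 flags as suspicious, so it learns
# to separate true cases from stage-1 false alarms.
flag_tr = stage1.predict_proba(X_tr)[:, 1] > 0.5
stage2 = XGBClassifier(n_estimators=200, eval_metric="logloss")
stage2.fit(X_tr[flag_tr], y_tr[flag_tr])

# A case is predicted positive only if both stages agree at strict thresholds.
flag_te = stage1.predict_proba(X_te)[:, 1] > 0.5
pred = np.zeros_like(y_te)
pred[flag_te] = (stage2.predict_proba(X_te[flag_te])[:, 1] > 0.9).astype(int)

print("precision:", precision_score(y_te, pred, zero_division=0),
      "sensitivity:", recall_score(y_te, pred))
```

    As in the study, this kind of cascade trades sensitivity for precision, which suits confirmatory rather than screening use.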
  • Behavioral and Psycho-Emotional mHealth Interventions for Elderly Breast Cancer Patients with Cardiac Toxicity.
    3 weeks ago
    The CARDIOCARE project combines psycho-emotional and behavioral interventions into a mobile health (mHealth) application for elderly breast cancer patients at risk of therapy-induced cardiac toxicity. The mHealth application includes psycho-emotional modules such as Expressive Writing, the ABCDE Model, and Best Possible Self, alongside behavioral interventions such as prompted voiding and pelvic floor exercises, aimed at improving both psychological resilience and physical health. By addressing both physical and psychological needs, the app aims to improve patient adherence to treatment and to alleviate healthcare burden. Preliminary results based on data collected from 67 patients across six clinical centers show promising trends: 45% of patients initially expressed denial in their first entries using the ABCDE module, which later shifted towards acceptance and active coping strategies. These findings show the potential of the CARDIOCARE interventions to enhance well-being in elderly cancer patients. Ongoing trials are expected to provide a more comprehensive understanding of these interventions and their impact on psychological well-being and overall quality of life. Clinical Relevance - This study investigates the potential of the CARDIOCARE mHealth application to address both psychological and physical needs in elderly cancer patients with therapy-induced cardiac toxicity. Preliminary results from six clinical centers indicate that the application can support elderly cancer patients by helping them express their emotions, cope with their illness, and adopt healthy routines. These interventions could help clinicians enhance patient care by providing personalized support and remote monitoring, resulting in better quality-of-life outcomes.
    Cancer
    Cardiovascular diseases
    Access
    Care/Management
    Advocacy
  • CT Heterogeneity and Dose Distribution Patterns in Block and Ring Regions Improved the Prediction of Radiation Pneumonitis.
    3 weeks ago
    To investigate the effect of CT heterogeneity and dose distribution patterns on the occurrence of radiation pneumonitis (RP), this study retrospectively analyzed 251 lung cancer patients. Each patient's CT and dose images were divided into 8 block regions and 6 ring regions, defined by intersecting specific CT structures with regions of specific dose values (a simplified sketch of this region definition follows this entry). From each modality, 1158 radiomics features were extracted, characterizing the shape, density, voxel intensity, and texture of each region; dose-volume histogram (DVH) parameters were also calculated. Seven machine learning models were used to predict patients' RP status: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbors (K-NN), Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), and Categorical Boosting (CatBoost). The results showed that multi-modal dosimetry and radiomics features are more effective predictors than DVH parameters. Among the block regions, areas with dose values ≥ 30 Gy performed best, with the highest AUC of 0.886; among the ring regions, areas with dose values between 40 and 50 Gy achieved the highest AUC of 0.977. In summary, finely divided ring regions yielded better and more stable predictive performance than block regions, providing a basis for personalized treatment planning with the potential to improve treatment outcomes. Clinical Relevance - CT heterogeneity and dose distribution patterns in the 40-50 Gy ring region are most relevant to the occurrence of RP; physicians should pay particular attention to this region in treatment planning.
    Cancer
    Chronic respiratory disease
    Access
    Care/Management
    Advocacy
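    As a rough illustration of the dose-defined regions described above, the sketch below builds a block mask (all voxels at or above a dose level) and a ring mask (voxels within a dose band) from stand-in CT and dose arrays and computes a few first-order statistics per region. The arrays, thresholds, and features are placeholders; the study's structure intersections and full 1158-feature radiomics pipeline are not reproduced.

```python
# Toy dose-defined block and ring regions on random stand-in volumes.
import numpy as np

rng = np.random.default_rng(0)
ct = rng.normal(-300, 200, size=(64, 64, 64))   # stand-in CT volume (HU)
dose = rng.uniform(0, 60, size=ct.shape)        # stand-in dose map (Gy)

def block_mask(dose, low):
    """Block region: every voxel receiving at least `low` Gy."""
    return dose >= low

def ring_mask(dose, low, high):
    """Ring region: voxels receiving between `low` and `high` Gy."""
    return (dose >= low) & (dose < high)

def simple_features(ct, mask):
    """A few first-order stand-ins for per-region radiomics features."""
    vals = ct[mask]
    return {"volume_voxels": int(mask.sum()),
            "mean_hu": float(vals.mean()),
            "std_hu": float(vals.std())}

print("block >=30 Gy:", simple_features(ct, block_mask(dose, 30)))
print("ring 40-50 Gy:", simple_features(ct, ring_mask(dose, 40, 50)))
```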
  • Conditional Generative Adversarial Network for Predicting the Aesthetic Outcomes of Breast Cancer Treatment.
    3 weeks ago
    The alterations to the visual appearance of patients' breasts caused by locoregional breast cancer treatment can affect patients' self-esteem and satisfaction, and hence quality of life after treatment. It is therefore imperative that patients are adequately informed of the potential aesthetic outcomes of treatment, to facilitate the choice of treatment and promote realistic expectations. As breast asymmetries are among the most notable effects of treatment, we propose a conditional generative adversarial network for manipulating breast shape in torso images, applying it to simulate how the breasts' shape may change after surgical intervention (a toy sketch of the conditioning mechanism follows this entry). Experiments on a private breast dataset suggest that the proposed model outperforms the state of the art in realistically reconstructing the patient's torso while effectively manipulating the breasts. Clinical relevance - The proposed model enables visualizing alterations to the shape of a patient's breasts caused by locoregional breast cancer treatment, aiding the patient in the choice of treatment plan when multiple options are available.
    Cancer
    Access
    Care/Management
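    The paper's network design is not described in this abstract, so the sketch below only shows the generic idea of a conditional GAN for image manipulation: a generator and discriminator each receive the torso image concatenated with a broadcast "shape code". The layer sizes, the shape-code representation, and the conditioning scheme are all assumptions for illustration only.

```python
# Toy conditional GAN forward pass; not the authors' architecture or training.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, cond_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + cond_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, img, cond):
        # Broadcast the shape code over the spatial grid and concatenate.
        cond_map = cond[:, :, None, None].expand(-1, -1, *img.shape[2:])
        return self.net(torch.cat([img, cond_map], dim=1))

class Discriminator(nn.Module):
    def __init__(self, cond_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + cond_dim, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, img, cond):
        cond_map = cond[:, :, None, None].expand(-1, -1, *img.shape[2:])
        return self.net(torch.cat([img, cond_map], dim=1))

torso = torch.randn(2, 3, 64, 64)     # stand-in torso photographs
target_shape = torch.randn(2, 8)      # hypothetical post-surgery shape code
fake = Generator()(torso, target_shape)
score = Discriminator()(fake, target_shape)
print(fake.shape, score.shape)
```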
  • Multi-modal data fusion for enhanced pancreatic cancer detection.
    3 weeks ago
    Pancreatic cancer is among the deadliest cancers globally, primarily because it is often diagnosed at advanced stages. While deep learning has significantly advanced computer-aided diagnosis (CADx), most models operate in a single-modality framework. Integrating complementary information from multiple data sources in the medical domain could enhance CADx performance. To explore this potential, this work investigates the effectiveness of different data-fusion strategies across different image encoders. Specifically, the paper evaluates three fusion methods: (1) data-level fusion, (2) decision-level fusion, and (3) feature-level fusion (a schematic comparison of the three follows this entry). To construct a baseline, these methods are evaluated on a novel multi-modal animal dataset comprising 2D natural images accompanied by self-annotated attributes. The study is then extended to the medical domain using an internal pancreatic cancer dataset composed of 3D abdominal CT scans and corresponding clinical features. For both datasets, the integration of image and attribute features yields significant improvements in classification performance. Specifically, for the pancreas dataset, data-level and decision-level fusion consistently outperform single-modality experiments. Our best-performing model achieves an area under the receiver operating characteristic curve (AUC) of 0.94±0.01, significantly surpassing both its image-only (0.87±0.01) and attribute-only (0.87±0.02) baselines. This study highlights the effectiveness of multi-modal CADx in the medical domain. Our comparison with a larger-scale natural image dataset underscores the potential for even greater improvements over image-only and attribute-only approaches through more advanced fusion methods as more data becomes available. To encourage further research, the code and the newly introduced animal dataset will be made publicly available at: https://github.com/ConstancaSilva07/Multi-modalData-Fusion-for-Pancreatic-Cancer-Detection.
    Cancer
    Access
    Care/Management
    Advocacy
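    The three fusion levels named in the abstract can be contrasted in a few lines. The sketch below uses toy linear layers and pre-extracted image embeddings as stand-ins; the paper's image encoders, datasets, and training setup are not reproduced.

```python
# Schematic data-level, decision-level, and feature-level fusion on toy tensors.
import torch
import torch.nn as nn

img = torch.randn(4, 128)    # stand-in image embedding from an image encoder
attr = torch.randn(4, 10)    # stand-in clinical / self-annotated attributes

# (1) Data-level fusion: concatenate the raw inputs before a single model.
data_level = nn.Sequential(nn.Linear(128 + 10, 64), nn.ReLU(), nn.Linear(64, 1))
p_data = torch.sigmoid(data_level(torch.cat([img, attr], dim=1)))

# (2) Decision-level fusion: separate classifiers, then average their outputs.
img_clf, attr_clf = nn.Linear(128, 1), nn.Linear(10, 1)
p_dec = torch.sigmoid(img_clf(img)).add(torch.sigmoid(attr_clf(attr))).div(2)

# (3) Feature-level fusion: encode each modality, then fuse the embeddings.
img_enc, attr_enc = nn.Linear(128, 32), nn.Linear(10, 32)
feat_head = nn.Linear(64, 1)
p_feat = torch.sigmoid(feat_head(torch.cat([img_enc(img), attr_enc(attr)], dim=1)))

print(p_data.shape, p_dec.shape, p_feat.shape)   # one probability per sample
```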
  • ResNext based U-Net for segmenting sonomammogram.
    3 weeks ago
    Breast cancer detection through ultrasound imaging presents challenges due to variability in image quality and interpretation. This study introduces a novel ResNext-based U-Net architecture for the segmentation of sonomammograms, aiming to enhance accuracy and reliability. The proposed model integrates ResNext blocks within the U-Net framework, leveraging residual connections to improve feature extraction and gradient propagation (an approximate off-the-shelf equivalent is sketched after this entry). We evaluated the model's performance across five folds, comparing it with a baseline U-Net and with ResNet-based encoders. Our results indicate that the inclusion of ResNext blocks improves segmentation performance, particularly in capturing finer details and enhancing specificity. The enhanced architecture offers a promising tool for aiding radiologists in the early detection and diagnosis of breast cancer, providing a reliable and accurate method for automated sonomammogram segmentation.
    Cancer
    Access
    Advocacy
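    One readily available way to pair a ResNeXt encoder with a U-Net decoder is via segmentation_models_pytorch, shown below. The paper integrates ResNext blocks into the U-Net directly, so treat this only as an approximate stand-in; the input size and channel choices are placeholders.

```python
# Off-the-shelf U-Net with a ResNeXt encoder as a rough analogue of the idea.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnext50_32x4d",   # ResNeXt backbone feeding skip connections
    encoder_weights=None,             # or "imagenet" for pretrained weights
    in_channels=1,                    # grayscale ultrasound (sonomammogram)
    classes=1,                        # binary lesion mask
)

ultrasound = torch.randn(2, 1, 256, 256)   # stand-in B-mode images
mask_logits = model(ultrasound)
print(mask_logits.shape)                   # (2, 1, 256, 256)
```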
  • Utilizing synthetic data for privacy-preserving AI modeling in radiomics: a case study.
    3 weeks ago
    Preserving the privacy of AI models in healthcare is critical due to the sensitive nature of patient data, particularly in radiomics, which provides a unique "signature" that can potentially re-identify individuals. This study explores synthetic radiomics data as a privacy-preserving strategy for AI-driven prostate cancer aggressiveness classification. Radiomic features were extracted from 4,588 retrospective and 1,369 prospective multiparametric MRI (mpMRI) examinations across 12 EU centers. Three advanced generators were explored: (i) a custom Bayesian Gaussian Mixture Model with optimal component estimation (BGMMOCE), (ii) the Conditional Tabular Generative Adversarial Network (CTGAN), and (iii) the Tabular Variational AutoEncoder (TVAE). Data fidelity was assessed using statistical measures such as the Jensen-Shannon divergence (JSD) and the Hellinger distance (HD); a minimal example of these fidelity checks follows this entry. Based on our findings, BGMMOCE achieved higher fidelity (e.g. JSD 0.08, HD 0.23) than the other generators. A Random Forest (RF) classifier trained on the BGMMOCE-generated data and tested on real prospective cases showed performance comparable to the model trained only on real data, highlighting the balance between fidelity and privacy. Clinical Relevance - This work addresses the challenge of preserving patient privacy in AI-driven radiomics analysis by leveraging synthetic data while maintaining model performance.
    Cancer
    Access
    Care/Management
    Advocacy
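    The two fidelity measures named above can be computed from histogram estimates of a feature's distribution in the real and synthetic data, as sketched below. The feature values are random stand-ins; the study's generators (BGMMOCE, CTGAN, TVAE) and mpMRI radiomics are not reproduced.

```python
# Jensen-Shannon divergence and Hellinger distance between real and synthetic
# distributions of one (stand-in) radiomic feature.
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, 2000)        # stand-in real radiomic feature
synthetic = rng.normal(0.1, 1.1, 2000)   # stand-in synthetic counterpart

bins = np.histogram_bin_edges(np.concatenate([real, synthetic]), bins=50)
p = np.histogram(real, bins=bins)[0].astype(float)
q = np.histogram(synthetic, bins=bins)[0].astype(float)
p, q = p / p.sum(), q / q.sum()

def kl(a, b, eps=1e-12):
    """Kullback-Leibler divergence with a small epsilon for empty bins."""
    return np.sum(a * np.log((a + eps) / (b + eps)))

m = 0.5 * (p + q)
jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)                        # Jensen-Shannon divergence
hellinger = np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

print(f"JSD={jsd:.3f}  HD={hellinger:.3f}")   # lower values = higher fidelity
```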
  • AI-Driven Pathomics for Predicting Chemotherapy Response in Metastatic Colorectal Cancer: A Transfer Learning Approach with Attention-Based Multiple Instance Learning.
    3 weeks ago
    Predicting chemotherapy response in metastatic colorectal cancer (mCRC) remains challenging due to the lack of reliable biomarkers. We developed an artificial intelligence (AI)-driven pathomics approach that integrates multicenter data and provides visual explanations to enhance interpretability. Using transfer learning and attention-based multiple instance learning (MIL), our deep learning algorithm automatically extracted predictive features from whole slide images (WSIs); a minimal attention-MIL sketch follows this entry. The model was pretrained on The Cancer Genome Atlas (TCGA) to identify survival-related histopathological patterns and fine-tuned on a multicenter mCRC cohort to classify patients as chemotherapy-sensitive or resistant. Compared to a baseline model trained solely on mCRC data, our approach improved model stability, reduced variability, and increased predictive accuracy in the holdout test set, achieving an improvement of nearly 15 percentage points in the area under the curve (AUC: 0.54 to 0.68). High attention-weighted tile visualizations highlighted tumor microenvironment features potentially linked to chemotherapy resistance, offering insights for both clinical decision-making and biological research. Clinical relevance - This study demonstrates the feasibility of using AI and digital pathology, through transfer learning, to enhance chemotherapy response prediction in mCRC. By improving interpretability and incorporating multicenter data, our approach offers insights that could inform treatment strategies. Although further refinement is needed for clinical deployment, this framework lays the foundation for integrating multi-omics data to achieve AI-powered personalized oncology care.
    Cancer
    Access
    Care/Management
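    Attention-based MIL pools a bag of tile embeddings into a single slide-level prediction while exposing per-tile attention weights, which is what enables the tile visualizations mentioned above. The sketch below shows this pooling mechanism only; the paper's pretrained feature extractor, TCGA pretraining, and fine-tuning pipeline are not reproduced, and the feature dimensions are assumptions.

```python
# Minimal attention-based MIL pooling over tile embeddings.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, attn_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1))
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, tiles):                              # tiles: (n_tiles, feat_dim)
        a = torch.softmax(self.attention(tiles), dim=0)    # attention weight per tile
        slide_embedding = (a * tiles).sum(dim=0)           # attention-weighted pooling
        return self.classifier(slide_embedding), a         # slide logit + tile weights

# One slide = a bag of tile features from a pretrained encoder (random stand-ins).
tile_features = torch.randn(1000, 512)
logit, attn = AttentionMIL()(tile_features)
print(torch.sigmoid(logit).item(), attn.shape)   # P(resistant), per-tile weights
```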
  • Novel Multimodal Breast Phantoms vs. Real: Even Experts Can't Tell the Difference.
    3 weeks ago
    Current research on rapid breast cancer screening emphasizes the importance of hybrid sensing methods, such as tactile imaging and ultrasound. However, the lack of phantoms designed to support both modalities simultaneously hinders their development. This study evaluates the realism and imaging performance of a novel multi-material breast phantom through expert visual assessment. Ultrasound and tactile imaging experts analyzed 16 images (both cadaveric and phantom), identifying them as real or synthetic, rating their confidence, and noting distinguishing features. A confusion matrix was used to calculate the true positive rate (TPR), reflecting correct identification of cadaveric images, and the false positive rate (FPR), indicating instances where a phantom was misclassified as real (a worked example of this computation follows this entry). The mean TPR values were 0.46 for ultrasound and 0.40 for tactile imaging, highlighting how difficult it is to distinguish the phantom from real tissue. Experts identified key features, such as noise, smoothness, and structural irregularities, but often struggled to confidently differentiate real from synthetic tissue. Clinical Relevance - This study introduces visual observation as a valuable evaluation method, expanding training phantom development beyond material property measurements. The phantom's realism supports testing, training, and optimization of hybrid imaging devices and imaging protocols, while facilitating further research on combined ultrasound and tactile imaging techniques for clinical integration.
    Cancer
    Access
    Care/Management
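    The TPR/FPR computation described above treats "cadaveric (real)" as the positive class, so a phantom rated as real counts as a false positive. The ratings below are illustrative only, not the study's data.

```python
# TPR and FPR from a confusion matrix over expert "real vs synthetic" ratings.
import numpy as np
from sklearn.metrics import confusion_matrix

truth  = np.array([1, 1, 1, 1, 0, 0, 0, 0])   # 1 = cadaveric image, 0 = phantom
rating = np.array([1, 0, 1, 0, 1, 0, 0, 1])   # expert calls of "real"

tn, fp, fn, tp = confusion_matrix(truth, rating).ravel()
tpr = tp / (tp + fn)    # cadaveric images correctly identified as real
fpr = fp / (fp + tn)    # phantom images misclassified as real
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

    A TPR near 0.5 with a nontrivial FPR, as reported in the study, means experts performed close to chance at telling phantom from tissue.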
  • Segmentation Variability in Bayesian U-Net versus Manual Annotations: Impact on Radiomic Reproducibility in Lung Tumor CT Images.
    3 weeks ago
    Radiomic analysis is highly sensitive to variations in region of interest (ROI) segmentation. Automatic segmentation methods based on deep learning (DL) can enhance radiomic reproducibility due to their high accuracy. Moreover, incorporating uncertainty quantification into these approaches can increase the trustworthiness of DL predictions, particularly when this uncertainty accurately captures the variability observed in expert annotations. This study aims to evaluate whether the uncertainty quantified by DL-based segmentation aligns with expert variability and to identify the optimal configuration for maximizing the reproducibility of radiomic features. To this end, the Monte Carlo Dropout (MCD) approach was integrated into a U-Net model to segment lung tumors on CT images from two publicly available datasets (a minimal MCD inference sketch follows this entry). Tumor masks manually delineated by multiple experts were compared with masks predicted by MCD-based inference at various confidence levels. Radiomic features were extracted from each segmentation, and reproducibility was assessed across combinations of confidence thresholds. The results indicate that the MCD approach can produce segmentations that partially reflect the variability observed in expert annotations, particularly at lower confidence thresholds. However, radiomic features remained highly sensitive to segmentation variability, with only about half of the features achieving reproducibility under the best conditions. Clinical relevance - This study supports the introduction and adoption of DL segmentation approaches and radiomic analysis in clinical practice by increasing the trustworthiness of their predictions compared with manual delineation.
    Cancer
    Chronic respiratory disease
    Access
    Care/Management
    Advocacy
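    Monte Carlo Dropout inference keeps dropout active at test time, runs repeated stochastic forward passes, and derives masks at different confidence thresholds from the per-voxel foreground frequency. The sketch below illustrates only that mechanism, using a tiny stand-in network and random input; the paper's U-Net, CT data, and radiomics pipeline are not reproduced.

```python
# Minimal Monte Carlo Dropout inference for segmentation uncertainty.
import torch
import torch.nn as nn

seg_net = nn.Sequential(                      # tiny stand-in for the U-Net
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Dropout2d(0.5),                        # the stochastic MCD layer
    nn.Conv2d(16, 1, 3, padding=1))

ct_slice = torch.randn(1, 1, 128, 128)        # stand-in CT slice

seg_net.train()                               # keep dropout ON for MC sampling
with torch.no_grad():
    samples = torch.stack([torch.sigmoid(seg_net(ct_slice)) for _ in range(30)])

foreground_prob = samples.mean(dim=0)         # per-pixel mean over 30 MC passes
for conf in (0.25, 0.5, 0.75):                # confidence thresholds to compare
    mask = foreground_prob > conf
    print(f"threshold {conf}: {int(mask.sum())} tumor pixels")
```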