Immunohistochemical information integrated pre-training improves HER2 status prediction from whole slide images of breast cancer.
Evaluating Human Epidermal Growth Factor Receptor 2 (HER2) expression is essential for the precise treatment of breast cancer. Immunohistochemistry (IHC) is the established gold standard for HER2 assessment, but it is costly. To reduce the need for IHC tests, this paper develops HI-MAE, a patch-level feature encoder that is, to our knowledge, the first to use both hematoxylin and eosin (H&E) and IHC information when pre-training a masked autoencoder (MAE), yielding more effective feature encoding of H&E-stained images. Combined with various multiple instance learning (MIL) models, HI-MAE enables direct prediction of HER2 status from H&E-stained whole slide images (WSIs). On the TCGA-BRCA dataset, integrating IHC information into the feature encoding of H&E-stained images improves HER2 status prediction, raising the AUC from 0.59 to 0.74. HI-MAE opens avenues for future research by facilitating the incorporation of IHC information into other classification tasks, particularly precise biomarker prediction.

Clinical relevance: By incorporating IHC images into pre-training, we enable the prediction of HER2 status directly from routine H&E-stained tissue sections under the guidance of IHC information, so no IHC images are needed at test time. This reduces the staining costs associated with IHC.
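
To illustrate how patch-level features can be aggregated into a slide-level HER2 prediction, the following is a minimal sketch of attention-based MIL pooling. It is not the method of this paper: the abstract does not specify which MIL models are used, so the linear attention scoring, the function name `attention_mil_pool`, and the parameters `attn_w`, `clf_w`, and `clf_b` are all assumptions made for illustration, and the patch features stand in for the output of an encoder such as HI-MAE.

```python
import math
from typing import List, Sequence, Tuple


def attention_mil_pool(
    patch_features: Sequence[Sequence[float]],
    attn_w: Sequence[float],
    clf_w: Sequence[float],
    clf_b: float = 0.0,
) -> Tuple[float, List[float]]:
    """Hypothetical attention-based MIL pooling over one slide (bag).

    patch_features: one feature vector per patch (assumed encoder output).
    attn_w: weights of an assumed linear attention scorer.
    clf_w, clf_b: weights and bias of an assumed linear slide classifier.
    Returns (slide-level probability, per-patch attention weights).
    """
    # One scalar attention score per patch (linear scoring is an assumption).
    scores = [sum(f * w for f, w in zip(feat, attn_w)) for feat in patch_features]

    # Softmax over patches, shifted by the maximum for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Slide embedding: attention-weighted average of the patch features.
    dim = len(patch_features[0])
    slide = [
        sum(a * feat[d] for a, feat in zip(weights, patch_features))
        for d in range(dim)
    ]

    # Linear classifier with a sigmoid gives a slide-level probability.
    logit = sum(s * w for s, w in zip(slide, clf_w)) + clf_b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, weights


# Toy bag of three 2-dimensional patch features (made-up values).
features = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.4]]
prob, weights = attention_mil_pool(features, attn_w=[1.0, 1.0], clf_w=[2.0, -1.0])
```

In this sketch the attention weights sum to one across patches, so the slide embedding is a convex combination of patch features; in practice the scorer and classifier would be learned jointly from slide-level HER2 labels.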