Structural Guidance in a Stacked Generative Diffusion Model: Synthesizing Head and Neck CT from MRI in Radiotherapy Planning
Head and neck radiotherapy planning often combines a patient's MRI, which provides soft-tissue contrast, with a pre-treatment CT, which enables dosimetry planning. Synthesizing missing CT data from available MR images minimizes radiation exposure and facilitates adaptive re-planning. We propose a generative diffusion model that synthesizes CT images of tumors from available MR modalities, incorporating structural guidance within a stacked diffusion framework. The model comprises two stacked denoising diffusion probabilistic models (DDPMs). The first is a structure image generator that produces structural representations of CT images from the corresponding MRI inputs. These representations are then passed to a second, contextual image DDPM, which takes both the original MRI and the generated structural representations as an augmented multi-channel input to improve CT synthesis. Training follows a variational inference approach that combines a variational lower bound loss with a mean absolute error loss, leveraging both structural and contextual features. Evaluated on the Head and Neck Organ-at-Risk Multi-Modal dataset (HaN-Seg), our model outperforms recent MR-to-CT generative diffusion models, achieving a multiscale structural similarity index (multiscale-SSIM) of 0.85 ± 0.08, a mean absolute error (MAE) of 0.09 ± 0.06, and a peak signal-to-noise ratio (PSNR) of 22.05 ± 1.83. Additionally, the model achieves the highest probabilistic rand index (PRI) of 0.83 ± 0.04, with a Dice score of 0.75 ± 0.07 and a global consistency error (GCE) of 0.16 ± 0.05 on the segmented tumor area of the synthetic CT (sCT) images.
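The training objective described above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's implementation: it assumes the variational lower bound term reduces to the standard DDPM noise-prediction MSE (the common simplification), and the weighting factor `lam`, the timestep schedule, and the toy array shapes are all illustrative assumptions.

```python
import numpy as np

def q_sample(x0, t, alphas_cumprod, noise):
    """Forward diffusion: noise x0 to timestep t (standard DDPM closed form)."""
    a = alphas_cumprod[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def combined_loss(pred_noise, true_noise, pred_x0, true_x0, lam=1.0):
    """Simplified variational-bound term (noise MSE) plus an MAE term,
    mirroring the combined objective described in the abstract."""
    l_vlb = np.mean((pred_noise - true_noise) ** 2)
    l_mae = np.mean(np.abs(pred_x0 - true_x0))
    return l_vlb + lam * l_mae

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)           # illustrative linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)

ct = rng.standard_normal((64, 64))               # stand-in for a CT slice
noise = rng.standard_normal(ct.shape)
x_t = q_sample(ct, t=500, alphas_cumprod=alphas_cumprod, noise=noise)

# A real network would predict the noise from (x_t, MRI slice, structure map, t);
# here we feed the ground truth through to show the loss wiring only.
loss = combined_loss(noise, noise, ct, ct)
print(loss)  # 0.0 by construction
```

In the stacked setup, the same wiring would run twice: once for the structure-image DDPM conditioned on MRI alone, and once for the contextual DDPM conditioned on the MRI stacked channel-wise with the generated structure map.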