Towards Intelligent Agents for Radiotherapy: Integrating Exploration-Exploitation with Foundation Models.
This study proposes an automated approach to radiotherapy treatment planning that integrates a reinforcement-learning-style iterative framework with a multimodal large language model (LLM). We focus on beam angle optimization, a high-dimensional, non-convex subproblem of treatment planning. Our system employs GPT-4V to select candidate beam angles and to analyze three-dimensional dose distributions generated by Monte Carlo simulations within the matRad environment. Iterative plan refinement is guided by a reward function that encourages target dose conformity and penalizes excessive dose to organs at risk. We incorporate exploration-exploitation principles to balance investigating diverse action proposals against refining promising solutions. Experimental results on prostate cancer cases show that our LLM-based framework outperforms random beam selection and can exceed the plan quality of deep reinforcement learning baselines, indicating the potential of LLMs to assist in complex radiotherapy treatment planning tasks.
Clinical relevance: This approach is designed to reduce the significant effort of manual treatment planning by assisting medical physicists in exploring beam configurations and systematically refining plans to improve dose coverage and protect healthy tissues.
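As an illustration of the reward-guided refinement loop described in the abstract, the following is a minimal sketch under stated assumptions, not the implementation used in this study. The abstract does not give the reward formula or the exploration scheme, so the quadratic penalties, the epsilon-greedy rule, and every name and default value here (`plan_reward`, `refine`, `evaluate`, `prescribed`, `oar_limit`, `epsilon`) are hypothetical. In particular, a random proposer stands in for the LLM, and the caller-supplied `evaluate` function stands in for dose calculation followed by reward scoring.

```python
import random


def plan_reward(target_dose, oar_dose, prescribed=60.0, oar_limit=20.0, oar_weight=1.0):
    """Toy reward (assumed form): penalize target deviation from the
    prescription and organ-at-risk (OAR) dose above a limit. Doses in Gy."""
    target_pen = sum((d - prescribed) ** 2 for d in target_dose) / len(target_dose)
    oar_pen = sum(max(0.0, d - oar_limit) ** 2 for d in oar_dose) / len(oar_dose)
    return -(target_pen + oar_weight * oar_pen)


def refine(evaluate, n_beams=5, iterations=50, epsilon=0.3, step=10, seed=0):
    """Epsilon-greedy search over gantry angle sets (degrees in [0, 360)).

    `evaluate` maps a sorted list of angles to a scalar reward; here it is a
    placeholder for dose simulation plus reward scoring.
    """
    rng = random.Random(seed)
    grid = range(0, 360, step)
    best = sorted(rng.sample(grid, n_beams))
    best_reward = evaluate(best)
    for _ in range(iterations):
        if rng.random() < epsilon:
            # Explore: propose an entirely new angle set.
            candidate = sorted(rng.sample(grid, n_beams))
        else:
            # Exploit: shift one beam of the best plan so far by one grid step.
            candidate = list(best)
            i = rng.randrange(n_beams)
            candidate[i] = (candidate[i] + rng.choice([-step, step])) % 360
            if len(set(candidate)) < n_beams:
                continue  # skip proposals that would duplicate a beam
            candidate.sort()
        reward = evaluate(candidate)
        if reward > best_reward:
            best, best_reward = candidate, reward
    return best, best_reward
```

In this sketch, `epsilon` sets the share of iterations spent on fresh proposals versus local refinement of the incumbent plan; replacing the random proposer with model-generated candidates would leave the accept-if-better loop unchanged.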