AutoEpiCollect 2.0: A Web-Based Machine Learning Tool for Personalized Peptide Cancer Vaccine Design.
Personalized cancer vaccines are a key strategy for training the immune system to recognize and respond to tumor-specific antigens. Our earlier software release, AutoEpiCollect 1.0, was designed to accelerate the vaccine design process, but the identification of tumor-specific genetic variants remained a manual and highly burdensome step. In this study, we introduce AutoEpiCollect 2.0, an improved version with integrated genetic analysis capabilities that automate the identification and prioritization of tumorigenic variants from individual tumor samples. AutoEpiCollect 2.0 integrates RNA sequencing (RNAseq) data and cross-references it for efficient determination of cancer-specific and prognostic gene variants. Using AutoEpiCollect 2.0, we conducted two case studies to design personalized peptide vaccines for two distinct cancer types: cervical squamous cell carcinoma and breast carcinoma. Case 1 analyzed five cervical tumor samples from different stages, ranging from CIN1 to cervical cancer stage IIB. CIN3 was selected for detailed analysis because of its pre-invasive status and clinical relevance, as it is the earliest stage at which patients typically present with symptoms. Case 2 examined five breast tumor samples, including HER2-negative, ER-positive, PR-positive, and triple-negative subtypes. In three of these breast samples, the same epitope was identified, arising from identical gene variants. This finding suggests the presence of shared antigenic targets across subtypes. We identified the top MHC class I and class II epitopes for both cancer types. In cervical carcinoma, the most immunogenic epitopes were found in proteins encoded by HSPG2 and MUC5AC. In breast carcinoma, the epitopes with the highest potential were derived from proteins encoded by BRCA2 and AHNAK2. These epitopes were further validated through pMHC-TCR modeling analysis.
Despite differences in cancer type and tumor subtype, both case studies successfully identified high-potential epitopes suitable for personalized vaccine design. The integration of AutoEpiCollect 2.0 streamlined the variant analysis workflow and reduced the time required to identify key tumor antigens. This study demonstrates the value of automated data integration in genomic analysis for cancer vaccine development. Furthermore, by applying RNAseq in a standardized workflow, the approach enables both patient-specific and population-level vaccine design, based on statistically frequent gene variants observed across tumor datasets. AutoEpiCollect 2.0 is freely available as a web-based tool for users to design vaccines.
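The idea of selecting statistically frequent gene variants across tumor samples can be illustrated with a minimal sketch. This is not AutoEpiCollect 2.0's implementation: the function name `frequent_variants`, the `min_fraction` threshold, and the variant identifiers are hypothetical placeholders, and the toy input is invented purely for illustration rather than taken from the study.

```python
from collections import Counter


def frequent_variants(samples, min_fraction=0.5):
    """Return {variant: fraction of samples} for variants found in at
    least `min_fraction` of the samples.

    `samples` maps a sample ID to the list of variant identifiers called
    in that sample. Both the input shape and the threshold are
    assumptions made for this sketch.
    """
    if not samples:
        return {}
    counts = Counter()
    for variants in samples.values():
        # Count each variant at most once per sample.
        counts.update(set(variants))
    n_samples = len(samples)
    return {
        variant: count / n_samples
        for variant, count in counts.items()
        if count / n_samples >= min_fraction
    }


# Placeholder identifiers only; these are not variants reported in the study.
toy_samples = {
    "sample_1": ["GENE_A:var1", "GENE_B:var2"],
    "sample_2": ["GENE_A:var1", "GENE_C:var3"],
    "sample_3": ["GENE_A:var1", "GENE_B:var2", "GENE_B:var2"],
    "sample_4": ["GENE_D:var4"],
}

shared = frequent_variants(toy_samples, min_fraction=0.5)
```

Under these assumptions, variants that recur across samples would be candidates for population-level design, while variants unique to one sample would remain relevant only to that patient's vaccine.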
Authors
Kim, Shelton, Samudrala, Savsani, Dakshanamurthy