Transcriptomic analysis in autism spectrum disorder suggests three molecular subtypes with distinct phenotypic profiles and functional pathways.
Autism spectrum disorder (ASD) is highly heterogeneous, and molecular subtyping based on gene expression data can help address this biological heterogeneity. We analyze RNA-seq data from 1711 ASD samples, using unsupervised non-negative matrix factorization (NMF) to identify subtypes, and characterize the subtypes across five phenotypic categories: overall ASD symptoms, social-communication deficits, restricted-repetitive-stereotyped behaviors, cognitive-adaptive function, and behavior problems. We also explore subtype-specific differentially expressed genes, functional pathways, and spatiotemporal and cell-type characteristics. NMF analysis identifies three ASD subtypes with different phenotypic patterns: Cluster 1 has severe restricted and repetitive behaviors, Cluster 3 mainly has social communication impairments, and Cluster 2 presents milder symptoms and better cognitive function. Differentially expressed gene analysis links Cluster 1 and Cluster 3 to nervous system dysfunctions and morphogenesis of a branching structure, and Cluster 2 to immune system processes. The clusters also exhibit distinct gene expression patterns across brain regions, developmental stages, and cell types. Validation in two independent datasets confirms the reproducibility of the clusters. This molecular subtyping reveals biologically and clinically distinct subtypes of ASD, which helps clarify the atypical mechanisms underlying ASD heterogeneity and may contribute to the development of personalized therapeutic strategies for each subtype.
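
As a rough illustration of how NMF can group samples into subtypes, the sketch below factorizes a non-negative genes-by-samples matrix V into W (gene weights per factor) and H (factor coefficients per sample), then assigns each sample to its dominant factor. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the multiplicative-update solver, the argmax assignment rule, the function names `nmf` and `assign_subtypes`, and the synthetic toy matrix are all illustrative choices, and the abstract does not specify how the rank or the cluster assignments were actually determined.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (genes x samples) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm
    objective (an assumption; other NMF solvers exist).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps    # gene loadings per factor
    H = rng.random((k, n_samples)) + eps  # factor coefficients per sample
    for _ in range(n_iter):
        # Updates keep W and H non-negative; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_subtypes(H):
    """Assign each sample to the factor with the largest coefficient."""
    return H.argmax(axis=0)


# Synthetic toy data (hypothetical, not study data): 30 genes x 12 samples,
# where each group of 4 samples over-expresses its own block of 10 genes.
rng = np.random.default_rng(42)
n_genes, n_per_group, k = 30, 4, 3
V = rng.random((n_genes, n_per_group * k)) * 0.1
for g in range(k):
    rows = slice(g * 10, (g + 1) * 10)
    cols = slice(g * n_per_group, (g + 1) * n_per_group)
    V[rows, cols] += 5.0

W, H = nmf(V, k)
labels = assign_subtypes(H)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("subtype labels:", labels)
print("relative reconstruction error:", rel_error)
```

On real expression data, the number of factors is usually chosen by comparing clustering stability across several candidate ranks and random restarts rather than fixed in advance as it is here.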