Allele-specific genomics decodes gene targets and mechanisms of the non-coding genome.
A large proportion of disease variants lies in non-coding RNA (ncRNA) gene loci, which have been identified as key regulatory elements. However, the targets of most ncRNAs are unknown, hindering our ability to understand complex diseases. Here, we found that allele-specific ncRNAs were enriched near allelic protein-coding genes (pcGenes), suggesting that allele-specific information could be used to predict cis-acting ncRNA targets. We translated this concept into the Allelome.LINK framework and applied it to the major mouse organs, revealing 397 events in which the allele-specific expression (ASE) of an ncRNA correlated or anticorrelated with the ASE of a nearby pcGene, suggesting enhancing or repressive regulatory interactions, respectively. Integrating heart H3K27ac ChIP-seq data enabled the linkage of putative allelic enhancers to allele-specific gene loci and provided insight into ncRNA- versus DNA-mediated regulatory effects. Next, we applied our strategy to the largest human dataset, which includes tissues from nearly 1000 individuals. Given the high genetic diversity across humans, each individual allows for the discovery of novel ASE correlation events. We uncovered 2291 ncRNA-mRNA ASE events along with their mechanisms and benchmarked them against sample-matched eQTLs, yielding a high validation rate of 77.47%. Further GWAS integration assigned variants overlapping informative ncRNAs to their pcGene targets. As more sequencing data and risk variants become available, this strategy has the potential to decode the entire cis-acting landscape of the non-coding genome.
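The linking idea described in the abstract can be illustrated with a minimal sketch. This is not the Allelome.LINK implementation: the `Locus` record, the allelic-ratio cutoff of 0.7, the 100 kb window, and all locus names are hypothetical choices made only for illustration. The sketch pairs each allele-biased ncRNA with allele-biased pcGenes on the same chromosome within the window, and labels a pair "enhancing" when both favor the same allele and "repressive" when they favor opposite alleles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    # Hypothetical record; field names are assumptions, not from the paper.
    name: str
    chrom: str
    start: int
    end: int
    biotype: str          # "ncRNA" or "pcGene"
    allelic_ratio: float  # fraction of reads from allele 1, in [0, 1]


def bias(locus, threshold=0.7):
    """Return +1 (allele 1 bias), -1 (allele 2 bias), or 0 (biallelic).

    The 0.7 cutoff is an assumed value for illustration.
    """
    if locus.allelic_ratio >= threshold:
        return 1
    if locus.allelic_ratio <= 1 - threshold:
        return -1
    return 0


def distance(a, b):
    """Gap in bp between two intervals on one chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def link_loci(loci, window=100_000, threshold=0.7):
    """Pair allele-biased ncRNAs with nearby allele-biased pcGenes.

    Same-direction bias is labeled "enhancing", opposite-direction bias
    "repressive". The window size is an assumption.
    """
    ncrnas = [l for l in loci if l.biotype == "ncRNA" and bias(l, threshold)]
    pcgenes = [l for l in loci if l.biotype == "pcGene" and bias(l, threshold)]
    links = []
    for nc in ncrnas:
        for pc in pcgenes:
            if nc.chrom != pc.chrom or distance(nc, pc) > window:
                continue
            same = bias(nc, threshold) == bias(pc, threshold)
            links.append((nc.name, pc.name, "enhancing" if same else "repressive"))
    return links


# Toy input with made-up loci.
example = [
    Locus("ncA", "chr1", 1_000, 2_000, "ncRNA", 0.9),
    Locus("geneA", "chr1", 50_000, 60_000, "pcGene", 0.85),
    Locus("geneC", "chr1", 500_000, 510_000, "pcGene", 0.9),  # outside window
    Locus("ncB", "chr2", 1_000, 2_000, "ncRNA", 0.1),
    Locus("geneB", "chr2", 5_000, 9_000, "pcGene", 0.95),
    Locus("geneD", "chr2", 10_000, 12_000, "pcGene", 0.5),    # biallelic
]
links = link_loci(example)
```

A real analysis would additionally need read-count-based significance testing for each allelic ratio and replicate handling, which this sketch omits.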
Authors
Hasenbein, Hoelzl, Engelhardt, Andergassen