Empirically determined baseline masking strategies and other considerations for gene-level burden tests.

Rare-variant association studies typically perform gene-level tests in which coding variants are filtered (or 'masked') and aggregated based on functional annotation and allele frequency. Through a systematic literature review, we cataloged 664 masks used across 234 studies and found that masking strategies (that is, sets of masks) rarely repeat across studies and are rarely justified. To quantify their impact on association results, we applied all previously employed strategies to 54 traits within 189,947 UK Biobank exomes. Here we find that the number of significant associations greatly depends on the masking strategy (ranging from 58 to 2,523 associations), which is a key reason for the modest overlap (<30%) of associations between separate published analyses of this dataset. We empirically determine masking strategies with high discovery power for low-frequency and rare variant gene-level associations across numerous datasets and traits, and we use these to explore the impact of other factors on burden test results. These findings offer a baseline strategy in burden tests to increase study power and replicability, addressing one source of inconsistency in previous studies.
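As a minimal sketch of what a mask does, the Python below filters a gene's variants by functional annotation and allele frequency, then aggregates the surviving alleles into a per-sample burden score. All names here (`Variant`, `Mask`, `burden_scores`), the annotation labels, the frequency thresholds, and the toy genotypes are illustrative assumptions, not the masks or data used in the study. A real burden test would go on to regress the phenotype on this score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    annotation: str  # illustrative labels: "pLoF", "missense", "synonymous"
    maf: float       # minor allele frequency


@dataclass(frozen=True)
class Mask:
    """Hypothetical mask: a set of allowed annotations plus a MAF ceiling."""
    annotations: frozenset
    max_maf: float

    def keeps(self, variant: Variant) -> bool:
        return variant.annotation in self.annotations and variant.maf <= self.max_maf


def burden_scores(variants, genotypes, mask, gene):
    """Sum alternate-allele counts over the variants in `gene` that pass `mask`.

    genotypes[i][j] is the alternate-allele count (0, 1, or 2) of sample i
    at variants[j].
    """
    kept = [j for j, v in enumerate(variants) if v.gene == gene and mask.keeps(v)]
    return [sum(row[j] for j in kept) for row in genotypes]


# Toy data (made up for illustration).
variants = [
    Variant("GENE1", "pLoF", 0.0005),
    Variant("GENE1", "missense", 0.004),
    Variant("GENE1", "missense", 0.02),
    Variant("GENE1", "synonymous", 0.0002),
]
genotypes = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 2, 0],
]

# Two masks with assumed thresholds: a strict one and a broader one.
strict = Mask(frozenset({"pLoF"}), max_maf=0.001)
broad = Mask(frozenset({"pLoF", "missense"}), max_maf=0.01)

strict_scores = burden_scores(variants, genotypes, strict, "GENE1")
broad_scores = burden_scores(variants, genotypes, broad, "GENE1")
```

In this toy data the two masks assign different burden scores to the same samples, which is the mechanism by which the choice of masking strategy can change which gene-level associations reach significance.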
Cardiovascular diseases
Care/Management

Authors

Nguyen, Koesterer, Haydarlou, Dornbos, Yoshiji, Llamas, Jang, Smadbeck, Moriondo, Hoang, Ruebenacker, Bezzina, Ellinor, Jurgens, Burtt, Flannick