Interpreting Treatment Effects Using Posterior Probabilities: A Bayesian Reanalysis of 230 Phase III Oncology Trials.
Most oncology trials define superiority according to dichotomized P value thresholds, which are frequently misinterpreted. Posterior probability, however, directly estimates the probability of the hypothesis at hand. Here, we reanalyze a large collection of modern phase III trials and benchmark posterior probability versus the standard trial interpretation based on statistical significance.
Outcomes from 194,129 patients were manually reconstructed from the primary end points of 230 phase III, superiority-design oncology trials. Posterior probabilities of treatment effect were then calculated across multiple priors and several effect sizes of clinical relevance, including minimum clinically important difference (MCID) defined as hazard ratio (HR) < 0.8 per ASCO criteria or HR < 0.64 per European Society of Medical Oncology (ESMO) criteria.
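As an illustration of how a posterior probability such as P(HR < 0.8) can be obtained from a reported hazard ratio, the following is a minimal sketch using a normal approximation on the log-HR scale with a conjugate normal prior. It is not the authors' model, which the abstract does not specify. The function name, the prior parameters, and the example trial figures are all hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def posterior_prob_hr_below(hr_hat, ci_low, ci_high, threshold,
                            prior_mean=0.0, prior_sd=10.0):
    """Posterior probability that the true HR is below `threshold`.

    Assumes (for illustration only) that log(HR) is approximately normal
    and that the prior on log(HR) is Normal(prior_mean, prior_sd**2).
    """
    # Point estimate and standard error on the log scale,
    # with the SE recovered from the reported 95% confidence interval.
    log_hr = math.log(hr_hat)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

    # Conjugate normal-normal update: precisions add, and the posterior
    # mean is the precision-weighted average of prior mean and estimate.
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * log_hr)

    # P(log HR < log threshold) under the normal posterior.
    z = (math.log(threshold) - post_mean) / math.sqrt(post_var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical trial: HR 0.75 (95% CI, 0.62 to 0.91).
for label, cutoff in [("HR < 1", 1.0), ("HR < 0.8", 0.8), ("HR < 0.64", 0.64)]:
    vague = posterior_prob_hr_below(0.75, 0.62, 0.91, cutoff)
    skeptical = posterior_prob_hr_below(0.75, 0.62, 0.91, cutoff, prior_sd=0.2)
    print(f"{label}: vague prior {vague:.3f}, skeptical prior {skeptical:.3f}")
```

In this hypothetical case, a result that is conventionally significant has a very high probability of marginal benefit (HR < 1) but a much lower probability of clearing the stricter MCID thresholds. A prior centered on no effect with a small standard deviation pulls the probability of benefit further toward the null.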
All trials interpreted as superior using P value thresholds had probabilities >90% for achieving at least marginal benefits (HR < 1). However, only 62% of positive trials (74/120) had >90% probabilities of achieving the ASCO MCID (HR < 0.8), even under an enthusiastic prior, including 70% of trials (57/82) leading to regulatory approval. Only 30% of positive trials (36/120) had >90% probability of achieving the ESMO MCID (HR < 0.64). Conversely, 24% of trials (26/110) interpreted as not superior had >90% probability of achieving marginal benefits (HR < 1), even under a skeptical prior.
Bayesian models, although often in agreement with statistical significance thresholds, add considerable unique interpretative value for a subset of phase III oncology trials. Posterior probability may provide a solution for overcoming the discrepancies between refuting the null hypothesis and detecting clinically relevant effects.
Authors
Sherry, Msaouel, Kupferman, Lin, Abi Jaoude, Kouzy, El-Alam, Patel, Koong, Lin, Passy, Miller, Beck, Fuller, Meirson, McCaw, Ludmir