Is There Only One Analysis?
Received Date: February 25, 2020; Published Date: March 02, 2020
There has been much controversy about the role of p < 0.05, the traditional requirement that the attained significance level from an experiment fall below 0.05 for the outcome to be considered evidence to reject the null hypothesis. The p < 0.05 controversy is most acute in clinical research because the regulatory agencies (e.g., the U.S. Food and Drug Administration and the European Medicines Agency) have traditionally insisted that sponsors (pharmaceutical or medical device companies) submit a frequentist intent-to-treat efficacy analysis of confirmatory clinical trials (usually Phase III), that is, an analysis of all patients as randomized regardless of how they are actually treated after treatment starts, with the requirement that the Type I error rate for the treatment comparison on the primary efficacy endpoint be less than 0.05 for marketing approval. We refer to this as the regulatory analysis because regulatory agencies find it useful in their work. For a given medical product, which we generalize herein with the term “experimental treatment”, the regulatory analysis is most often the only one that will ever be reported in a journal, and this is why many physicians, biostatisticians and other scientists are pushing back against the p < 0.05 requirement.
Other analyses related to the same experimental treatment can be produced for audiences other than regulators. Table 1 presents a list of analyses for four additional audiences: sponsor internal, practicing physicians, patients and third-party payers (health insurance companies, national health insurance). Each audience has its own objectives, and the differing objectives call for different types of analyses that cannot be driven by p < 0.05. The regulatory analysis emphasizes hypothesis testing, while the other analyses rely mostly on estimation.
Table 1: Potential Efficacy Analyses for An Experimental Treatment.
Sponsors will do additional analyses to gain a better understanding of the experimental treatment, especially which groups of patients respond to treatment better than others and which are most vulnerable to adverse events. While sponsors often use the intent-to-treat analysis, there will often be a need to perform a per-protocol analysis (analyzing only patients who followed the clinical trial protocol, received the maximum number of doses, etc.). Methods of causal inference such as propensity scores (Rubin, 1997) might be used to make treatment comparisons more useful and, although p-values might be calculated, sponsors might find the Bayes factor (Kass and Raftery, 1995) or the evidential likelihood support level (Royall, 1997) to be helpful.
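To illustrate how a Bayes factor can sit alongside a p-value, the following minimal Python sketch compares response rates in two trial arms. It is an illustration under stated assumptions, not a sponsor's actual method: the endpoint is assumed to be binary (responder or not), each response rate is given a uniform Beta(1, 1) prior, and the function names and patient counts are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bayes_factor_10(x1, n1, x2, n2):
    """Bayes factor for H1 (arms have separate response rates) versus
    H0 (arms share one response rate), assuming uniform Beta(1, 1) priors.

    x1, x2 are responder counts and n1, n2 are arm sizes. The binomial
    coefficients are the same under both hypotheses and cancel.
    """
    log_m1 = log_beta(x1 + 1, n1 - x1 + 1) + log_beta(x2 + 1, n2 - x2 + 1)
    log_m0 = log_beta(x1 + x2 + 1, n1 + n2 - x1 - x2 + 1)
    return math.exp(log_m1 - log_m0)


def two_sided_p_value(x1, n1, x2, n2):
    """Two-sided p-value from the pooled two-proportion z-test.

    Assumes the pooled response rate is strictly between 0 and 1,
    otherwise the standard error is zero.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 30 of 50 responders on the experimental
# treatment versus 20 of 50 on control.
print("Bayes factor (H1 vs H0):", bayes_factor_10(30, 50, 20, 50))
print("Two-sided p-value:", two_sided_p_value(30, 50, 20, 50))
```

A Bayes factor above 1 favors a difference between arms and one below 1 favors no difference, so, unlike the p-value, it can also express support for the null hypothesis. The result depends on the choice of prior, which is why the uniform prior here should be read as a placeholder.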
Practicing physicians might have the objective of precision medicine and might use the confirmatory clinical trials together with a meta-analysis of data from clinical trials of similar treatments and related insurance claims data. They might prefer a Bayesian per-protocol design and refer to the posterior odds to quantify evidence.
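As a reminder of how posterior odds quantify evidence, the standard Bayesian identity below relates them to the prior odds and the Bayes factor. The notation is introduced here for illustration only: H_1 and H_0 denote two competing hypotheses about the experimental treatment, and D denotes the combined data.

```latex
\[
\underbrace{\frac{P(H_1 \mid D)}{P(H_0 \mid D)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(D \mid H_1)}{P(D \mid H_0)}}_{\text{Bayes factor}}
\times
\underbrace{\frac{P(H_1)}{P(H_0)}}_{\text{prior odds}}
\]
```

For example, prior odds of 1 combined with a Bayes factor of 10 give posterior odds of 10 in favor of H_1.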
Patients will likely be interested in a risk (adverse events) versus benefit (efficacy) analysis. Insurance claims databases, per-protocol analysis and statistical decision theory might be useful here. Evidence might be summarized by an inequality in quality-adjusted life years (QALYs) (Weinstein et al., 2009). Similarly, third-party payers (health insurance companies, health maintenance organizations) would have an interest in cost-effectiveness and might prefer to summarize evidence with a cost per QALY.
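The two summaries mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not a full health-economic model: QALYs are taken as time in each health state weighted by a utility between 0 and 1, discounting is ignored, and the function names and all numbers are hypothetical.

```python
def qalys(health_states):
    """Total QALYs from a list of (years, utility) pairs.

    Utility is assumed to lie between 0 (death) and 1 (full health).
    Discounting of future years is deliberately omitted.
    """
    return sum(years * utility for years, utility in health_states)


def cost_per_qaly(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per QALY gained by the new treatment."""
    gained = qaly_new - qaly_old
    if gained == 0:
        raise ValueError("no QALY difference, so the ratio is undefined")
    return (cost_new - cost_old) / gained


# Hypothetical patient profiles under each treatment.
experimental = qalys([(2, 0.9), (3, 0.7)])  # 2 good years, then 3 fair years
standard = qalys([(1, 0.9), (3, 0.6)])      # 1 good year, then 3 poorer years

# Patient view: an inequality in QALYs.
print("Experimental treatment yields more QALYs:", experimental > standard)

# Payer view: incremental cost per QALY, with hypothetical costs.
print("Cost per QALY gained:", cost_per_qaly(60000, experimental, 30000, standard))
```

The patient-oriented summary only asks which treatment yields more QALYs, whereas the payer-oriented summary divides the additional cost by the additional QALYs, so the two audiences can reach different conclusions from the same data.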
The point of this presentation is that all these analyses should be published. Perhaps certain journals will prefer to publish only the regulatory analysis, others only the practicing physician analysis, and so on. However, the proliferation of analyses of these types will move biostatistics and clinical research away from p < 0.05 as the sole summary of evidence, and the differing audiences might learn from one another; for example, regulators might learn from the third-party payers, and the sponsor internal analysis might borrow ideas from the patient analysis.
There will, of course, be a need for frequent interdisciplinary seminars and changes in graduate school curricula to support this disruption in the dissemination of evidence. Sponsors typically perform many Bayesian analyses incorporating real-world data for their internal use in order to get a better understanding of their experimental treatments. This may be taken as an early signal that different analyses for different interest groups will be routinely published in the future.
Conflict of Interest
No conflict of interest.
References
- Rubin DB (1997) Estimating Causal Effects from Large Data Sets Using Propensity Scores. Annals of Internal Medicine 127(8 Pt 2): 757-763.
- Kass RE, Raftery AE (1995) Bayes Factors. Journal of the American Statistical Association 90: 773-795.
- Royall RM (1997) Statistical Evidence: A Likelihood Paradigm. Chapman & Hall, London.
- Weinstein MC, Torrance G, McGuire A (2009) QALYs: The Basics. Value in Health 12(Suppl 1): S5-S9.