Adaptive Design and the Estimand Framework
Received Date: February 22, 2019; Published Date: March 13, 2019
Adaptive designs allow prospectively planned modifications of a study without undermining the integrity and validity of the study. The draft ICH E9 addendum recommends the estimand framework in the design, conduct, analysis, and interpretation of clinical trials for assessing the effectiveness of therapies. Herein, we discuss the possible impact and scientific implications of incorporating the estimand framework into adaptively designed clinical trials. It is hoped that this will elucidate how the framework provides a language for discussing relevant questions, related to the attributes of the estimand, that may arise from study adaptations.
Keywords: Enrichment design; Biomarker; Adaptive design; Intercurrent events; Estimand; Sample size reassessment; Group sequential design
Adaptive designs allow prospectively planned modification of a study without undermining the integrity and validity of the study. Adaptive designs are appealing for many reasons. They allow efficient identification of therapeutic benefits, can increase the probability of success in clinical trials, and allow informed flexibility in the conduct of clinical trials. Adaptive design reflects ethical practice in assessing the efficacy and safety of therapeutic agents. However, the implementation of these types of designs is not without challenges. For instance, considerable care is needed to ensure that adaptations do not inflate the overall type I error rate or endanger the interpretability of the trial results.
The draft ICH E9 addendum recommends a disciplined approach for ensuring alignment among the clinical trial protocol objectives, trial conduct, statistical analyses, and reported results. The addendum discusses sensitivity analysis, the impact of intercurrent events on the interpretation of estimated intervention effects, and the estimand framework. The estimand framework tasks trialists with defining the targeted quantity of statistical inference a priori and ensuring that the scientific objective is aligned with the research question, study design, data to be collected, and conduct of the trial, while considering the impact of intercurrent events on inference. The framework, in some sense, gives sponsors flexibility in selecting the treatment effects of interest and in ensuring that all stakeholders are aligned on these targeted treatment effects upfront.
The estimand is the unobserved quantity from a population that is the target of statistical inference. It is the population parameter that encompasses the objective of the clinical trial. This unobserved population parameter is based on patients indexed by the inclusion/exclusion criteria and encompasses those randomized. It is defined through the population, endpoint, intervention effects, and summary measure. The chosen estimator provides an estimate of this unobserved quantity (the estimand). Consider a two-stage adaptive design with one interim analysis and a final analysis. If, for example, the final analysis uses as an estimator the observed difference in proportions cured between the treatment and control groups without considering differences in patient characteristics between the populations at the two stages (interim and final), the resulting estimate may not reflect the estimand. This is because the estimate of the unobserved population quantity is obtained from a population that does not reflect the patients randomized. Similarly, if a study adaptation changes the scientific aspects of the study (the research question, say), the resulting estimand will be difficult to interpret, because the adaptation may have resulted in a different trial and hence a different estimand from the one the study initially targeted. In other words, the original study may have been abandoned.
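To make the two-stage example concrete, the following minimal sketch computes the expected value of a naive pooled difference in cure proportions when the patient mix changes between stages. The stage sizes, biomarker-positive fractions, subgroup effects, and the function name are all hypothetical illustrations rather than values from any trial, and the calculation assumes equal allocation to treatment and control within each stage.

```python
def expected_pooled_effect(stage_sizes, stage_pos_fraction, effect_pos, effect_neg):
    """Expected naive pooled difference in cure proportions.

    Assumes (hypothetically) that the true treatment effect is effect_pos in
    biomarker-positive patients and effect_neg in biomarker-negative patients,
    and that each stage randomizes 1:1, so both arms share the same patient mix.
    """
    total = sum(stage_sizes)
    n_positive = sum(n * f for n, f in zip(stage_sizes, stage_pos_fraction))
    frac_pos = n_positive / total
    # The pooled estimator is centered on the effect weighted by the pooled mix.
    return frac_pos * effect_pos + (1.0 - frac_pos) * effect_neg


# Hypothetical all-comers estimand: 50% biomarker positive throughout.
all_comers = expected_pooled_effect([100, 100], [0.5, 0.5], 0.30, 0.10)

# Hypothetical adaptation: stage 2 enrolls biomarker-positive patients only.
after_enrichment = expected_pooled_effect([100, 100], [0.5, 1.0], 0.30, 0.10)

print(round(all_comers, 3), round(after_enrichment, 3))
```

In this hypothetical setting the all-comers estimand is 0.20, whereas the naive pooled estimator is centered on 0.25, because three quarters of the pooled patients are biomarker positive. The estimator then targets a mixture population that matches neither the original nor the enriched population.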
We discuss the possible impact of study adaptations on the attributes of the estimand. It is hoped that examining trial adaptations in the context of the estimand framework will elucidate how this framework provides a language for discussing relevant questions, related to the attributes of the estimand, that may arise from study adaptations.
Adaptations that do not Impact the Estimand
Sample size re-estimation
Sample size is one of the critical design features of a clinical trial. The sample size of a trial is determined by the type I error rate, the type II error rate, the minimum meaningful detectable difference, and the variance of the primary efficacy endpoint. During the planning of clinical trials, the variability of the primary endpoint and the minimum detectable difference are rarely fully known. While a speculative minimum detectable difference may be chosen based on expert opinion or the treatment effects of the standard of care, complete information on the actual variability of the primary endpoint is often lacking because the drug under development might belong to a new class of therapeutic agents or the disease may lack a well-characterized natural history. Incorrect selection of any of these parameters affects the power of the study, and an inadequately powered study could fail to detect a meaningful treatment difference and yield inconclusive results.
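As an illustration of how these four quantities combine, the sketch below uses the standard normal-approximation formula for a two-arm parallel trial with a continuous endpoint, equal allocation, and a two-sided test: n per group = 2 * sigma^2 * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2. The choice of this formula and the function name are assumptions made for illustration; the text does not commit to a specific formula, and other designs (binary endpoints, crossover trials, unequal allocation) need different expressions.

```python
import math
from statistics import NormalDist


def n_per_group(sigma2, delta, alpha=0.05, power=0.80):
    """Per-group sample size under the normal approximation (illustrative).

    sigma2: assumed variance of the primary endpoint
    delta:  minimum detectable difference between group means
    alpha:  two-sided type I error rate
    power:  1 minus the type II error rate
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return math.ceil(2.0 * sigma2 * (z_alpha + z_beta) ** 2 / delta ** 2)


# Hypothetical planning values: variance 1.0, detectable difference 0.5.
print(n_per_group(1.0, 0.5))   # 63
# If the true variance is twice the planning value, the required n doubles.
print(n_per_group(2.0, 0.5))   # 126
```

The second call shows why a misjudged variance matters: a trial planned with 63 patients per group would be underpowered if the variance were actually twice the planning value.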
A sample size re-estimation (SSR) design [4,5] allows the study to begin with the best preliminary estimate of the sample size and allows the sample size needed to obtain the desired power to be reassessed with accumulated interim data. Of note, no other feature of the study is affected. An SSR design can adapt the sample size without compromising any of the attributes of the estimand. This is because, if done properly, SSR alone does not alter the scientific aspects of the trial; hence, the estimand and its attributes are not impacted. Sample size re-estimation can be conducted in a blinded or unblinded fashion. A blinded sample size re-estimation plan introduces no bias because, under normal theory, the estimators of the variance and the mean are stochastically independent; knowledge of the variance estimate as data accumulate therefore provides no information about mean differences between treatment groups. Nor does a blinded sample size re-estimation plan incur a type I error penalty.
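A blinded re-estimation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited reference: it uses the one-sample variance of the pooled interim data (treatment labels ignored), plugs it into the normal-approximation sample size formula for a two-arm trial with a continuous endpoint, never reduces the sample size below the planned value, and caps it at a prespecified maximum. The function names, the no-decrease rule, and the cap are hypothetical design choices.

```python
import math
from statistics import NormalDist, variance


def n_per_group(sigma2, delta, alpha=0.05, power=0.80):
    """Per-group sample size under the normal approximation (illustrative)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    return math.ceil(2.0 * sigma2 * z_sum ** 2 / delta ** 2)


def blinded_reestimate(pooled_values, delta, n_planned, n_max,
                       alpha=0.05, power=0.80):
    """Re-estimate the per-group sample size from blinded interim data.

    pooled_values: interim observations with treatment labels removed
    n_planned:     per-group sample size from the original plan
    n_max:         prespecified upper limit on the per-group sample size
    """
    # Only the variance is estimated; no treatment difference is computed.
    blinded_variance = variance(pooled_values)
    n_new = n_per_group(blinded_variance, delta, alpha, power)
    # Hypothetical rule: never decrease, never exceed the prespecified cap.
    return min(max(n_new, n_planned), n_max)


# Hypothetical interim data: 20 blinded observations.
interim = [0.0, 2.0] * 10
print(blinded_reestimate(interim, delta=0.5, n_planned=63, n_max=200))
```

Note that the one-sample variance of pooled data tends to overstate the within-group variance when a true treatment difference exists, so this simple estimate errs toward a larger sample size.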
A case study: sample size re-estimation in a bioequivalence trial: The second author provided biostatistical support for a bioequivalence trial of six formulations in parallel (, pg. 204) when he worked at Burroughs Wellcome Pharmaceutical Company in Research Triangle Park, NC in 1978. An earlier 2x2x2 crossover trial of two of the formulations did not demonstrate bioequivalence due to a treatment-by-period interaction, necessitating the use of data from only the first period. Since the variance estimate used to determine the sample size for the parallel, six-formulation trial was derived from the first-period data of that two-sequence, two-period crossover trial, which had a small number (8) of subjects per formulation, the development team was not very confident in the sample size estimate for the six-formulation trial. Therefore, a sample size re-estimation plan was included in the six-formulation trial. Twenty-four subjects were to be entered on each of 5 consecutive weekends. The data management department developed a computer program which 'kicked out', in blinded fashion, the variance estimate based upon the analysis model once the data (area under the curve) were available from the subjects entered on each weekend (before the next weekend). After the second weekend (48 subjects, 8 per group), the variance estimate used for the original sample size estimation was greater than the variance estimate based upon the 48 subjects. It was thus concluded that the sample size estimate of 120 was adequate. We note that the trial was conducted long before there was much in the literature about sample size re-estimation or adaptive designs in general. See Peace and Chen for details about the bioequivalence trial.
Classical group sequential design
The most common adaptive design is the classical group sequential design (GSD). Group sequential trials are adaptive designs that prospectively define the criteria for early trial termination for efficacy or futility based on the results of interim analyses. A GSD could potentially reduce the expected sample size and accelerate the approval of new drugs that have demonstrated effectiveness. The classical GSD does not affect features of the study or the scientific aspects of the research question; hence, it does not impact any of the attributes of the estimand. Procedures for type I error control when using a GSD are well established.
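The need for type I error control with repeated looks can be illustrated by a small simulation. The sketch below is an assumption-laden illustration, not a design recommendation: it considers a two-look design with equal information at each look, simulates standardized test statistics under the null hypothesis, and compares testing at the unadjusted two-sided critical value 1.96 at both looks with testing at 2.178, the commonly tabulated Pocock constant for two equally spaced looks at an overall two-sided level of 0.05. The function name, seed, and number of simulations are arbitrary.

```python
import math
import random


def two_look_type1_error(critical_value, n_sim=200_000, seed=2019):
    """Simulated overall type I error for a two-look design under the null.

    With equal information per look, the final statistic is
    Z2 = (Z1 + Z1_new) / sqrt(2), where Z1_new is the independent
    standardized statistic from the second-stage data alone.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        z2 = (z1 + rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        # Reject if the boundary is crossed at the interim or the final look.
        if abs(z1) > critical_value or abs(z2) > critical_value:
            rejections += 1
    return rejections / n_sim


unadjusted = two_look_type1_error(1.96)    # same critical value at both looks
pocock = two_look_type1_error(2.178)       # Pocock-type constant boundary
print(f"unadjusted: {unadjusted:.3f}, Pocock-type: {pocock:.3f}")
```

Under these assumptions the unadjusted approach rejects a true null hypothesis noticeably more often than 5% of the time, while the adjusted boundary keeps the overall rate close to the nominal level. Neither approach alters the population, endpoint, or summary measure being estimated.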
Adaptations that May Impact the Estimand
Adaptation to the study population
In some population enrichment designs, it is possible to modify the trial population such that the interim analysis population differs from the final analysis population. This occurs when a differential treatment effect is expected in a subset of the trial population defined by some patient characteristics. Modification of the study population may involve changes to the enrolled study population and the primary analysis population, as well as testing hypotheses in multiple populations. Depending on the study design features, such modifications may or may not impact the estimand.
Example 1: biomarker stratify strategy: In the development of a therapeutic agent, an underlying belief supported by biological, pre-clinical, or retrospective clinical evidence may suggest that the agent will likely be more effective in one subgroup of patients than in others. Such information could be used to classify patients into biomarker subgroups with the aim of maximizing the power to detect treatment effects if interim data support the underlying assumptions. In the biomarker stratify design strategy, an all-comers patient population (those who satisfy the entry criteria) is stratified into biomarker-positive and biomarker-negative patients, and randomization is performed within strata. The fact that one stratifies implies that one expects biomarker-positive patients to respond differently from biomarker-negative patients. Based on interim analysis results, a decision is made whether enrollment should continue in the enriched biomarker-positive population or in the overall population. In the absence of a treatment-by-strata interaction, a reasonable common practice is to combine the strata and compare the drug to the placebo in a meta-analytic approach. In such an instance, the patient populations differ in that one (biomarker positive) is a subset of the other (combined). However, such a strategy does not appear to present any issue with the estimand framework: statistical analysis in either the biomarker-positive subset or the combined population raises no statistical issue from the estimand framework perspective. This is because the estimate of the unobserved population quantity is obtained from a population indexed by the patients randomized, and stratified randomization allows a valid answer to the scientific question within strata or in the combined patient population.
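One way to combine strata "in a meta-analytic approach" is inverse-variance weighting of the stratum-specific treatment effects. The sketch below assumes a binary endpoint summarized by a risk difference and a large-sample variance within each stratum; the weighting scheme, function names, and counts are hypothetical, since the text does not specify the combination method.

```python
def risk_difference(events_trt, n_trt, events_ctl, n_ctl):
    """Risk difference (treatment minus control) and its large-sample variance."""
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    var = p_trt * (1.0 - p_trt) / n_trt + p_ctl * (1.0 - p_ctl) / n_ctl
    return p_trt - p_ctl, var


def combine_strata(strata):
    """Inverse-variance weighted combination of stratum risk differences.

    strata: list of (events_trt, n_trt, events_ctl, n_ctl) tuples,
            one tuple per biomarker stratum.
    Returns the combined estimate and its variance.
    """
    estimates = [risk_difference(*s) for s in strata]
    if any(var <= 0.0 for _, var in estimates):
        # Occurs when every arm in a stratum has 0% or 100% response.
        raise ValueError("a stratum has zero estimated variance")
    weights = [1.0 / var for _, var in estimates]
    total_weight = sum(weights)
    combined = sum(w * d for w, (d, _) in zip(weights, estimates)) / total_weight
    return combined, 1.0 / total_weight


# Hypothetical counts: (responders on drug, n on drug, responders on placebo, n on placebo)
biomarker_positive = (30, 50, 20, 50)
biomarker_negative = (24, 60, 18, 60)
estimate, var = combine_strata([biomarker_positive, biomarker_negative])
print(round(estimate, 4), round(var, 5))
```

Because randomization is performed within each stratum, each stratum-specific difference is a valid comparison in its own right, and the combined estimate lies between the two stratum estimates. If a treatment-by-strata interaction were present, a single combined number would be harder to interpret.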
Adaptations that Impact the Estimand
Adaptations to scientific aspects of the study
Some adaptive designs are subject to potentially more complex adaptations than others. These complex adaptations include, but are not limited to, changes to treatments, endpoints, and molecular subtypes. Adaptations to some of these aspects of the study may lead to abandonment of the initial trial, in that the resulting trial is no longer able to address the intended scientific question because the estimand has changed. Because adaptations to the scientific aspects of the study reflect important clinical and scientific considerations, early interactions with the regulatory agency are recommended.
Example 2: Change in trial endpoint: In rheumatoid arthritis clinical trials, common criteria for assessing the effectiveness of treatment are the American College of Rheumatology (ACR) criteria and the disease activity score over 28 joints (DAS28). The ACR criteria are reported as a percentage improvement by a specified threshold from one time point to another. ACR20 (20% improvement) is a binary composite endpoint. The DAS28, on the other hand, is a continuous composite endpoint with components that are not necessarily the same as those of the ACR20. The two endpoints measure effects in opposite directions: ACR20 measures improvement following therapy, so larger values are desirable, whereas DAS28 is a symptom score for which a smaller mean score is desirable.
In the classical group sequential design, the endpoint and the test statistics used for the interim analyses are the same as those used for the final analysis. All other things being equal, there are no adaptations to the scientific aspects of the trial. For a K-stage group sequential design, Wang et al. propose group sequential tests with a change of endpoints and test statistics; the proposed design uses DAS28 in the first K-1 interim analyses and ACR20 in the final stage. This implies a modification of the research question and hence of many components of the estimand (variable, summary measure, and handling of intercurrent events). Admittedly, many proponents of adaptive design believe that it is permissible to change important scientific aspects of the study during trial conduct based on information obtained from the adaptive process. See for example . Nonetheless, juxtaposing the estimand framework with adaptive designs that modify the scientific aspects of a study clarifies that such a practice endangers the interpretability of the study results, because the estimand changes.
Comments and Concluding Remarks
We have discussed some adaptive designs in the context of the estimand framework. The adaptive designs covered in this brief note are those likely to arise in confirmatory trial settings. This is because confirmatory trials are not supposed to be exploratory; rather, confirmatory trials are designed to confirm findings observed in the exploratory stage of the drug development program. The choice of adaptations in the context of the estimand framework will depend on whether the study is an exploratory or a confirmatory trial. Because adaptation of the scientific aspects of the study reflects important clinical and scientific considerations, early interactions with the regulatory agency are encouraged. Indeed, the estimand framework provides a language for discussing relevant questions related to the attributes of the estimand that may arise from study adaptations.
Conflict of Interest
No conflict of interest.
- Chow SC, Liu JP (2014) Design and Analysis of Clinical Trials: Concepts and Methodologies 3rd edn, New Jersey: John Wiley and Sons Inc, USA.
- FDA (2017) E9(R1) statistical principles for clinical trials: Addendum: Estimands and sensitivity analysis in clinical trials, International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use, Tech Rep.
- K Peace, A Parrillo, C Hardy (2008) Validity of inference or public health research. Journal of The Georgia Public Health Association 1(1): 10-23.
- L Gould (1995) Planning and revising the sample size for a trial. Statistics in Medicine 14(9-10): 1039-1051.
- C Stein (1945) A two-sample test for a linear hypothesis whose power is independent of the variance. Annals of Mathematical Statistics 16: 243-258.
- KE Peace, DGD Chen (2010) Clinical Trial Methodology. Chapman and Hall/CRC.
- M Chang (2015) Introductory Adaptive Trial Designs: A Practical Guide with R. Chapman and Hall/CRC.
- FDA (2018) Adaptive Designs for Clinical Trials of Drugs and Biologics: Guidance for Industry, Food and Drug Administration, Rockville, Maryland.
- L Wang, Q Li, Z Li, A Kaur (2017) Group sequential design with change of endpoint. Statistics in Biopharmaceutical Research 9(4): 338-346.
- J Wittes (2002) On changing a long-term clinical trial midstream. Statistics in Medicine 21: 2789-2795.
- EMEA (2007) Reflection paper on methodological issues in confirmatory clinical trials planned with an adaptive design, The European Agency for Evaluation of Medicinal Products Evaluation of Medicine for Human Use, London, UK, Tech. Rep. CHMP/EWP/2459/02.