Research Article
Weighted Statistics for Testing Multiple Endpoints in Clinical Trials
Michael I Baron^{1}* and Laurel M MacMillan^{2}
^{1}American University, Washington DC, USA
^{2}Gryphon Scientific LLC, Takoma Park MD, USA
*Corresponding author: Michael I Baron, American University, Washington DC, USA.
Received Date: April 05, 2019; Published Date: May 02, 2019
Abstract
Bonferroni, Holm, and Holm-type stepwise approaches have been well developed for the simultaneous testing of multiple hypotheses in medical experiments. Methods exist for controlling familywise error rates at their preset levels. This article shows how the performance of these tests can often be substantially improved by accounting for the relative difficulty of the tests. Introducing suitably chosen weights optimizes the error spending between the multiple endpoints. Such an extension of classical testing schemes generally results in a smaller required sample size without sacrificing the familywise error rate or power.
Keywords: Error spending; Familywise error rate; Likelihood ratio test; Minimax; Stepwise testing
Introduction
Many clinical trials and other statistical experiments are conducted to test not one but many hypotheses. Often a decision has to be made on each individual null hypothesis instead of combining them into one composite statement. Most clinical trials of new medical treatments have to establish both safety and efficacy, often involving multiple endpoints or multiple competing treatments [1-4]. For example, recent clinical trials of Prometa, a drug addiction treatment, included testing for multiple side effects as well as multiple criteria of effectiveness such as reduction of craving, improvement of cognitive functions, and frequency of drug abuse [5,6]. Studies of genetic association explore multiple genes and multiple single nucleotide polymorphisms, or SNPs [7-9].
It is still common in applied research to conduct multiple tests, each at a nominal 5% level of significance, and report only those results where significant effects were observed. Anderson [10] estimates that 84% of randomized evaluation articles in diverse fields test five or more outcomes, and 61% test ten or more, yet fail to adjust for multiple comparisons. Clearly, when each hypothesis is tested at a given level α, the probability of committing a Type I error and reporting at least one significant effect is much higher than α, even when no effects exist in the population and all the null hypotheses are true.
For this reason, a number of methods for multiple comparisons have been developed to control the familywise error rate (FWER), which is the probability of rejecting at least one true null hypothesis; see [11-14] for an overview. The Bonferroni approach, due to its simplicity, arguably remains the most commonly used method of multiple testing. Each individual hypothesis is tested at a significance level α_{j}, guaranteeing that FWER ≤ α as long as Σα_{j} ≤ α. However, the underlying Bonferroni (Boole) inequality is not sharp, leaving room for improvement.
Enhancing the Bonferroni method, Holm [15] proposed a scheme based on the ordered p-values. Developing Holm's idea, step-up and step-down methods for multiple testing have been devised for nonsequential [11,16-19] and, most recently, sequential experiments [20-23]. These Holm-type methods (also called stepwise, for testing marginal hypotheses in the order of their significance) allow the use of higher levels α_{j}, leading to increased power, while still controlling FWER.
These stepwise methods, and most of the other approaches to multiple testing, do not account for the different levels of difficulty of the participating tests, that is, the proximity between the null hypotheses and their corresponding alternatives. Why should we take this into account when designing statistical experiments?
Example. As a simple example, consider simultaneous testing of three endpoints in a clinical trial, where the null parameter differs from the alternative parameter by 0.35 standard deviations in the first test, by 0.30 standard deviations in the second test, and by 0.25 standard deviations in the third test. What sample size suffices for controlling FWERs at α = 0.05 and β = 0.10, assuming normal measurements with known standard deviations?
Following the standard Bonferroni approach, we conduct each test at α_{j} = α/3 and β_{j} = β/3. This requires 129 observations for the first test, 175 for the second test, and 252 for the third test, computed by the formula n_{j} = ((z_{α_j} + z_{β_j})/δ_{j})², where z_{p} = Φ^{−1}(1 − p) and δ_{j} is the distance between the null and alternative parameters measured in respective standard deviations. It is not surprising that the easiest test, #1 (it is easier to detect a larger difference between the null and alternative hypotheses), requires the smallest sample size. Imagine, however, that all three data sequences are obtained from the same sampling units, such as patients who each answer three questions on a questionnaire, or whose blood samples are assayed for three substances. Then we still need to sample all 252 patients to guarantee the FWER control!
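The three sample sizes can be reproduced with a short script; the bisection-based quantile z(p) below is just a stdlib stand-in for any normal inverse cdf:

```python
import math

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z(p):
    """Upper-tail normal quantile: P(Z > z(p)) = p, by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1.0 - Phi(mid) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

deltas = [0.35, 0.30, 0.25]            # standardized distances
alpha_j, beta_j = 0.05 / 3, 0.10 / 3   # uniform Bonferroni spending
n = [math.ceil(((z(alpha_j) + z(beta_j)) / d) ** 2) for d in deltas]
print(n)  # [129, 175, 252]
```

Since all endpoints are measured on the same patients, the trial needs max(n) = 252 of them under this uniform spending.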
Since the three tests had differing levels of difficulty, the uniform error spending (α/3, α/3, α/3) and (β/3, β/3, β/3) was not optimal. As shown in Theorem 3.1 of De and Baron [24], the asymptotically most difficult test should optimally receive almost the entire allowed error probability, in the extreme case under the Pitman alternative. We do not have a limiting case here, but there is ample room for improvement. A better error spending is (α_{1} = 0.006, α_{2} = 0.014, α_{3} = 0.030) and (β_{1} = 0.011, β_{2} = 0.028, β_{3} = 0.061). In this case, a sample of 189 patients instead of 252 is sufficient, a pure 25% saving, using the (generalized) Bonferroni procedure.
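The claimed saving can be checked directly from the same sample-size formula, plugging in the weighted spending quoted above (the quantile helper is again a stdlib sketch):

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z(p):
    """Upper-tail normal quantile by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1.0 - Phi(mid) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

deltas = [0.35, 0.30, 0.25]
alphas = [0.006, 0.014, 0.030]   # weighted Type I spending (sums to 0.05)
betas  = [0.011, 0.028, 0.061]   # weighted Type II spending (sums to 0.10)
n = [math.ceil(((z(a) + z(b)) / d) ** 2)
     for a, b, d in zip(alphas, betas, deltas)]
# the three tests now need nearly equal samples, with max(n) near 189
```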
Stepwise procedures have the potential to increase this saving even further. However, the Holm method does not distinguish between "easy" and "difficult" tests. The Holm-adjusted levels of significance are α/d, α/(d − 1), …, α, regardless of the tested null and alternative parameter values and their proximity. In this paper, we generalize the Holm method to allow higher-than-Bonferroni significance levels and, at the same time, to account for the difficulty levels, which results in reduced required sample sizes.
The key to this optimization is minimaxity of the optimal error spending. Indeed, the sample size is determined by the test that requires the largest number of patients, because we need enough data to reach a decision on each individual hypothesis. Minimizing the overall sample size implies minimizing the largest sample size among the individual tests, and thus the solution of this problem is minimax. The form of this solution is an equalizer rule [25], defined in this case as the error spending that equalizes the required sample sizes.
We show in this article how the optimal solution can be calculated, and derive Bonferroni and Holm-type procedures that follow this minimax rule. Even for tests whose levels of difficulty are close (but not equal), these new methods may result in substantial cost savings.
Problem Formulation
Consider a sample of multidimensional measurements (X_{1}, …, X_{n}), where each X_{i} is d-dimensional; its jth component X_{ij}, the jth endpoint for the ith patient, has a marginal distribution with density f_{j}(x | θ^{(j)}) with respect to some probability measure μ_{j}, and θ^{(1)}, …, θ^{(d)} are the parameters of interest. Components of the same observed vector may be correlated; however, we do not assume any knowledge of their joint distribution and use only the marginal distributions for our statistical inference. For example, X_{i} may be vital signs measured on the ith patient or responses of the ith survey participant.
The goal is to conduct d tests of
H^{(j)}_{0}: θ^{(j)} = θ^{(j)}_{0} vs H^{(j)}_{A}: θ^{(j)} = θ^{(j)}_{A}, j = 1, …, d, (2.1)
controlling the Type I and Type II familywise error rates
FWER_{I} = P{reject H^{(j)}_{0} for at least one j ∈ T} and FWER_{II} = P{accept H^{(j)}_{0} for at least one j ∈ F}, (2.2)
where T ⊂ {1, …, d} is the index set of true null hypotheses, and F = T^{c} is its complement, the index set of false nulls.
In this article, we seek efficient nonsequential multiple testing procedures for (2.1). Under conditions
FWER_{I} ≤ α and FWER_{II} ≤ β, (2.3)
we aim to minimize the required sample size n (and therefore the overall cost of the experiment) by using efficient test statistics and optimal error spending.
A Clinical Trial of Flector
To see the size of the potential saving, let us consider a simple case of testing the means of two normal distributions,
H^{(j)}_{0}: θ^{(j)} = θ^{(j)}_{0} vs H^{(j)}_{A}: θ^{(j)} = θ^{(j)}_{A}, j = 1, 2, (3.1)
based on a sample of bivariate normal random vectors X_{1}, …, X_{n} with mean (θ^{(1)}, θ^{(2)}), known standard deviations σ^{(1)}, σ^{(2)}, and unknown correlation ρ.
Such a situation appeared, for example, in the design of a recent clinical trial of Flector, a patch containing a topical treatment of ankle sprains. Patients were randomized to three groups: a brand-name Flector patch, its generic version, and placebo. The trial was designed to support two statements: (1) that the generic patch is as effective as the brand name, and (2) that both of them are better than placebo. Thus, test 1 establishes bioequivalence of the two treatments, and test 2 establishes efficacy, where the two active treatment arms are merged and compared against the placebo arm. By the standard protocol, bioequivalence is established if r, the ratio of three-day mean pain reduction levels between generic and brand-name patients, has a 90% confidence interval entirely within the interval [0.8, 1.25]. Since we are actually interested in confirming that the generic patch is at least as efficient as the Flector patch, both tests can be reduced to the form (3.1), where θ^{(1)} = r (testing r = 0.8 vs r = 1.0) and θ^{(2)} = Δ = μ_{T} − μ_{P} is the difference in mean pain reduction levels between the merged active treatment group and the placebo group (testing Δ = 0 vs Δ = 4). Standard deviations σ^{(1)} = 0.373 and σ^{(2)} = 19.01, estimated from previous studies of similar products such as Lionberger et al. [37], imply the standardized distances
δ_{1} = (1.0 − 0.8)/0.373 ≈ 0.54 and δ_{2} = (4 − 0)/19.01 ≈ 0.21,
and thus the test of efficacy appears more difficult than the test of bioequivalence. Conducted at the actual marginal levels α_{1} = 0.05, β_{1} = 0.01, α_{2} = 0.05, β_{2} = 0.14, these tests required n = 169 patients in each treatment arm (the actual trial included 170 patients in each arm), and with this sample size, both test statistics appeared approximately normal.
Chosen to control the individual error probabilities, this sample size actually suffices to keep the Type I familywise error rate at the same level α = 0.05. The optimal error spending in this case is
0.05 = 0.00002 + 0.04998. (3.2)
That is, it is most efficient to split α = 0.05 very unevenly into α_{1} = 0.00002 and α_{2} = 0.04998, due to the different levels of difficulty. In other words, with one test being so much "easier" than the other, the whole trial can be planned around the most difficult hypothesis, whereas the "easier" test can then be conducted at practically no additional expense, matching the result in Theorem 3.1 of De and Baron [24].
Such unequal error spending is explained in Figure 1. We see that almost all the error is spent on the more difficult test if δ_{1}/δ_{2} ∉ (0.5, 2.0), i.e., when one test is at least twice as difficult as the other (Figure 1).
A simple computation shows that uniform α-spending for testing (3.1) with the listed β_{1,2} and FWER_{I} of α = 0.05 requires a sample of size n = 210 in each treatment arm, whereas n = 169 suffices with error spending (3.2), using the Bonferroni method.
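This computation can be sketched as follows, using the rounded standardized distances δ_{1} ≈ 0.2/0.373 ≈ 0.54 and δ_{2} ≈ 4/19.01 ≈ 0.21 derived from the Flector figures above:

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z(p):
    """Upper-tail normal quantile by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1.0 - Phi(mid) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

deltas = [0.54, 0.21]     # bioequivalence and efficacy tests
betas = [0.01, 0.14]      # fixed Type II levels from the trial

def max_n(alphas):
    """Sample size per arm: the larger of the two tests' requirements."""
    return max(math.ceil(((z(a) + z(b)) / d) ** 2)
               for a, b, d in zip(alphas, betas, deltas))

n_uniform = max_n([0.025, 0.025])      # equal split of alpha = 0.05
n_optimal = max_n([0.00002, 0.04998])  # error spending (3.2)
```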
Table 1 shows the optimal error spending of α = 0.05 and β = 0.10 for the case of d = 2 tests with different levels of difficulty δ_{1,2}. Naturally, the optimal split of α and β becomes more uneven when δ_{2} differs substantially from δ_{1}. For testing θ = 0.25 against θ = 0.3, the more difficult test already receives more than two-thirds of the allowed error probability. When δ_{2}/δ_{1} > 2, the required sample size N = 138 is the same as one needs to conduct just a single test of H^{(1)}_{0}. Thus, optimal error spending allows adding substantially easier tests at practically no extra cost, while controlling the familywise error rates. For comparison, the uniform error spending of the standard Bonferroni adjustment for multiple comparisons requires N = 201 for each of these tests.
The Holm-type stepwise approach provides further savings (Table 1).
Table 1: Optimal error spending of α = 0.05 and β = 0.10 for two tests and the required sample size N.
Minimax Error Spending
Minimax problem and equalizer solution
The optimal error spending in Section 3 is calculated by equating the sample sizes required to conduct each test in (3.1).
Indeed, as we know (for example, from [26], ch. 4), the minimum sample size needed to test the jth normal mean at levels α_{j} and β_{j} is
n_{j} = ((z_{α_j} + z_{β_j})/δ_{j})², (4.1)
where z_{p} = Φ^{−1}(1 − p) and Φ(⋅) is the standard normal cdf. A conclusive decision on each of the d tests requires a sample size n = max_{j}(n_{j}). Thus, minimization of the required sample size, min max_{j}(n_{j}), is a minimax problem, and its solution is an equalizer rule ([25], ch. 5), which is an error spending {α_{j}, β_{j}} that yields
n_{1} = n_{2} = … = n_{d}. (4.2)
Intuitively, optimality of the equalizer testing scheme is natural. Consider error probabilities α_{j}, β_{j} that correspond to unequal sample sizes n_{j} given by (4.1). Then, by slowly incrementing the error probabilities α_{j} and β_{j} that correspond to the largest sample size n_{j}, at the expense of the smaller sample sizes, we reduce the overall sample size n = max_{j}(n_{j}). The only situation when such a reduction is no longer possible is when all n_{j} are equal. Following the (generalized) Bonferroni approach, this minimax problem reduces to solving (4.2) in terms of {α_{j}, β_{j}} and minimizing the common n_{j} among all the existing solutions, subject to Σα_{j} = α and Σβ_{j} = β. For the tests of normal means, a convenient solution, close to being minimax, is
α_{j} = Φ(−c_{α} δ_{j}) and β_{j} = Φ(−c_{β} δ_{j}), j = 1, …, d, (4.3)
where c_{α} is the solution of the equation Σ_{j} Φ(−c_{α} δ_{j}) = α, and c_{β} solves Σ_{j} Φ(−c_{β} δ_{j}) = β. These equations have unique solutions because the function g(t) = Σ_{j} Φ(−t δ_{j}) is continuous and monotonically decreasing, from d/2 ≥ 1 at t = 0 to 0 at t = +∞. It follows from (4.1) that error spending (4.3) is an equalizer, although it is the optimal equalizer only when α = β. Why is there more than one equalizer solution? We are choosing 2d marginal significance levels {α_{j}, β_{j}}, j = 1, …, d, under two constraints on Σα_{j} and Σβ_{j}. Additional (d − 1) constraints appear in (4.2). Therefore, we have (d + 1) equations and 2d variables to choose, giving us at least one degree of freedom for all d ≥ 2 and room to minimize the common sample size n_{j}.
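A numeric sketch of the equalizer spending (4.3): solve for c_α and c_β by bisection (the left-hand side is monotone in c), after which every test requires the same sample size (c_α + c_β)², up to rounding. The three-endpoint example from the Introduction serves as a check:

```python
import math

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def solve_c(deltas, target):
    """Bisection for c solving sum_j Phi(-c * delta_j) = target."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if sum(Phi(-mid * d) for d in deltas) > target:
            lo = mid   # total error still too large: increase c
        else:
            hi = mid
    return (lo + hi) / 2

deltas = [0.35, 0.30, 0.25]        # the three-endpoint example
c_a = solve_c(deltas, 0.05)        # c_alpha
c_b = solve_c(deltas, 0.10)        # c_beta
alphas = [Phi(-c_a * d) for d in deltas]   # marginal levels alpha_j
n = math.ceil((c_a + c_b) ** 2)    # common (equalized) sample size
```

The resulting α_j are close to the rounded spending (0.006, 0.014, 0.030) quoted earlier, and the common sample size reproduces the roughly 25% saving over the uniform split.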
General distribution and Bahadur efficiency
For non-normal distributions, computation of the exact sample size necessary to attain a given significance level and power is "extremely difficult or simply impossible"; hence, asymptotics are used (Nikitin, 1995, sec. 1.1) [27]. For distributions that are approximately normal, the normal approximation yields a rather accurate estimate of the necessary sample size [28]. Then, error spending (4.3) is nearly optimal, and sample size (4.1) is nearly sufficient for the control of FWER_{I} and FWER_{II} at levels α and β. For the general case, the asymptotic result of Bahadur [29] about the p-value p^{(j)} of the jth test states that
limsup_{n→∞} −(1/n) log p^{(j)} ≤ K^{(j)}_{A} a.s. under H^{(j)}_{A}, (4.4)
where K^{(j)}_{A} is the Kullback-Leibler information number between H^{(j)}_{A} and H^{(j)}_{0}. Equality in (4.4) is attained by the likelihood ratio test (LRT) that rejects H^{(j)}_{0} for large values of the statistic
Λ^{(j)} = Σ_{i=1}^{n} log [ f_{j}(X_{ij} | θ^{(j)}_{A}) / f_{j}(X_{ij} | θ^{(j)}_{0}) ], (4.5)
making this test Bahadur asymptotically optimal (Bahadur, 1967, part II) [29]. Since our minimax problem is solved by an equalizer, and since the decision on each test is determined by comparing the p-values p^{(j)} with the marginal significance levels α_{j}, this suggests choosing the error spending α_{j} with log(α_{j}) proportional to K^{(j)}_{A}. In other words, the Bonferroni procedure for multiple testing that is based on the log-likelihood ratio statistics (4.5) for each marginal test, with error spending
α_{j} = exp(−c_{α} K^{(j)}_{A}), j = 1, …, d, (4.6)
c_{α} being the unique solution of Σ_{j} exp(−c_{α} K^{(j)}_{A}) = α, is asymptotically optimal in the Bahadur sense, and it controls the Type I familywise error rate at level α [30-35]. Similarly, the Type II error spending β_{j} = exp(−c_{β} K^{(j)}_{0}), with c_{β} solving Σ_{j} exp(−c_{β} K^{(j)}_{0}) = β, controls FWER_{II}.
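Such a spending can be computed by one more bisection. The sketch below uses hypothetical Kullback-Leibler numbers (not from any real trial) and assumes the form α_j = exp(−c K_j), so that log α_j is proportional to the K_j:

```python
import math

def solve_c(K, alpha):
    """Bisection for c solving sum_j exp(-c * K_j) = alpha;
    the left-hand side decreases from d (at c = 0) toward 0."""
    lo, hi = 0.0, 1e4
    for _ in range(200):
        mid = (lo + hi) / 2
        if sum(math.exp(-mid * k) for k in K) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

K = [0.02, 0.05, 0.10]   # hypothetical KL numbers; smaller = harder test
c = solve_c(K, 0.05)
alphas = [math.exp(-c * k) for k in K]
# the hardest test (smallest K) receives almost the entire error budget
```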
If the sample is sufficiently large, the multiple testing procedure with the introduced α- and β-spending controls both familywise error rates simultaneously. To see this, we notice that in order to control the probability of a Type I error, each marginal LRT rejects the corresponding null hypothesis H^{(j)}_{0} when its statistic (4.5) exceeds the corresponding critical value.
By Chebyshev's inequality, this probability vanishes as n → ∞ for each j, hence a_{j}(n) → −∞, and a_{j}(n) ≤ b_{j}(n) for sufficiently large n, which implies that FWER_{I} ≤ α and FWER_{II} ≤ β.
Generalized Holm method
Instead of comparing the marginal p-values p^{(j)} with levels α_{j}, Σα_{j} = α, Holm [15] proposed to compare the ordered p-values
p^{[1]} ≤ p^{[2]} ≤ … ≤ p^{[d]}
against α-levels that are generally larger, with the sum Σα_{j} > α. Choosing larger α-levels increases the power of the tests or, for the same power, requires a smaller sample. The null hypotheses H^{[j]}_{0} corresponding to the ordered p-values are arranged in the same order, and H^{[1]}_{0}, …, H^{[m]}_{0} are rejected, where m = max{k : p^{[j]} ≤ α/(d − j + 1) for all j ≤ k}. These rejected hypotheses correspond to the m most significant p-values.
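For reference, the standard (unweighted) Holm step-down rule can be sketched in a few lines:

```python
def holm(pvalues, alpha=0.05):
    """Holm step-down: compare ordered p-values with alpha/(d - step)
    and stop at the first failure. Returns indices of rejected nulls."""
    d = len(pvalues)
    order = sorted(range(d), key=lambda j: pvalues[j])
    rejected = set()
    for step, j in enumerate(order):
        if pvalues[j] <= alpha / (d - step):
            rejected.add(j)
        else:
            break  # accept this hypothesis and all less significant ones
    return rejected

print(holm([0.04, 0.001, 0.03, 0.01]))  # rejects hypotheses 1 and 3
```

In the example, 0.001 ≤ 0.05/4 and 0.01 ≤ 0.05/3, but 0.03 > 0.05/2, so the procedure stops after two rejections.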
This multiple testing procedure controls FWER_{I} ≤ α [15].
Holm's method does not account for different levels of difficulty of the tested hypotheses. However, it can be generalized to allow optimal solutions similar to (4.6) in the following way [36-41].
Let us order the Kullback-Leibler information numbers between the null hypotheses H^{(j)}_{0} and the alternatives H^{(j)}_{A} in their increasing order, K^{[1]} ≤ … ≤ K^{[d]}. Then, let a_{k} be the unique solution of the equation
Σ_{j=k}^{d} exp(−a_{k} K^{[j]}) = α. (4.7)
Also, consider the weighted statistics q^{(j)} = −log(p^{(j)})/K^{(j)} and order them, q^{[1]} ≥ … ≥ q^{[d]}. This order may differ from the ordering of the p-values p^{[j]}. In the new multiple testing procedure, we compare the ordered values q^{[j]} against the corresponding critical values a_{j}. Like Holm's method, the null hypotheses H^{[j]}_{0} corresponding to the ordered q^{[j]} are rejected as long as q^{[j]} ≥ a_{j}, and all the remaining H^{[j]}_{0} are accepted (not rejected) once q^{[j]} < a_{j}. This type of multiple testing procedure is step-down because it tests the marginal hypotheses in steps, moving from the most significant q-value to the least significant one, rejecting null hypotheses one at a time, and accepting all the remaining hypotheses once any one of them fails to be rejected.
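Once the critical values a_1 ≥ … ≥ a_d from (4.7) are available, the step-down rule itself is mechanical. A minimal sketch, with illustrative q-values and critical values that are not computed from any real data:

```python
def weighted_stepdown(q_ordered, crit):
    """Step-down over ordered weighted statistics q[0] >= q[1] >= ...
    against nonincreasing critical values crit[0] >= crit[1] >= ...
    Rejects one hypothesis at a time; stops (accepting the rest) at
    the first statistic falling below its critical value.
    Returns the number of rejected null hypotheses."""
    m = 0
    for qj, aj in zip(q_ordered, crit):
        if qj >= aj:
            m += 1
        else:
            break
    return m

print(weighted_stepdown([5.0, 3.2, 1.1], [2.5, 2.0, 1.5]))  # 2
```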
We show that this multiple testing scheme controls the Type I familywise error rate. First, we notice that the critical values a_{j} are arranged in a nonincreasing order.
Lemma 1. The critical values a_{k} given as solutions of (4.7) satisfy the inequality a_{1} ≥ … ≥ a_{d}.
Proof. If a_{k} > a_{k−1} for some k = 2, . . . , d, then we arrive at a contradiction,
Theorem 1. The proposed step-down multiple testing procedure with critical values a_{j} given by (4.7) for the weighted p-values q^{[j]} controls the Type I familywise error rate, FWER_{I} ≤ α.
Proof. The proof follows the general idea of Holm [15], adapted to the weighted p-values q^{[j]} and error spending (4.7). Considering the ordered null hypotheses H^{[j]}_{0}, let J be the first index of a true null hypothesis. In particular, this implies that the first (J − 1) null hypotheses are false; hence, the number of false nulls is at least J − 1.
The next fact to notice is that at least one Type I error occurs if and only if H^{[J]}_{0} is rejected. Indeed, acceptance of H^{[J]}_{0} means acceptance of all hypotheses H^{[j]}_{0} for j ≥ J, and since all the rejected hypotheses are then false nulls, there will be no Type I errors in this case. Therefore,
Here, the last inequality in (4.9) follows from Lemma 1; (4.10) from the definition of q^{(j)}; the first inequality in (4.11) from a standard bound on p-values (for example, from (1.2.1) of Nikitin, 1995 [27]); the second inequality of (4.11) from the increasing order of K^{[j]}; and the remainder of (4.11) follows from (4.7) with k = F + 1 = d − τ + 1, where τ is the number of true nulls.
Naturally, when all tests have the same difficulty level, in terms of a common Kullback-Leibler number K^{[j]} = K, equation (4.7) is solved by a_{k} = log((d − k + 1)/α)/K, and the generalized Holm procedure becomes the standard Holm's method as a special case.
As another extreme, suppose that one test is much more difficult than the other tests, namely K^{[1]} ≪ K^{[2]}, …, K^{[d]}. Then equation (4.7) with k = 1 is approximated by exp(−a_{1} K^{[1]}) = α, from where a_{1} ≈ log(1/α)/K^{[1]}, so the most difficult test is effectively conducted at the full level α.
Comparison
An extensive study of different scenarios may be required to evaluate the range of savings brought by each multiple testing method (Holm-type stepwise versus Bonferroni, weighted versus unweighted, and sequential versus nonsequential) for various distributions. Here, we consider just an illustrative example.
Consider testing two hypotheses about normal means. A sample of bivariate normal random vectors is observed.
Table 2: Required sample size E(T) for sequential multiple testing procedures. The standard Bonferroni and Holm-type stepwise tests are compared with their optimized versions designed under minimax error spending.
When the two tests have different levels of difficulty, the optimal error spending brings considerable cost savings; see Table 2. Smaller sample sizes due to the proposed minimax error spending method are seen in the columns "Generalized Bonferroni", compared to the standard Bonferroni method, and "Weighted Stepwise", compared to the standard stepwise Holm method (Table 2).
The weighting approach brings no saving when the two tests have the same level of difficulty; the proposed method is only efficient when the tests have different difficulty levels. The saving due to minimax error spending increases as the difference between the two tests grows. When one test is five times as difficult as the other, the minimax approach requires 443 fewer patients (34% saving) to conduct the Bonferroni procedure, and 190 fewer patients (18% saving) to conduct stepwise testing.
Acknowledgement
Research of both authors at American University was funded by the National Science Foundation.
Conflict of interest
No conflict of interest.
References
1. O'Brien PC (1984) Procedures for comparing samples with multiple endpoints. Biometrics 40: 1079-1087.
2. Pocock SJ, NL Geller, AA Tsiatis (1987) The analysis of multiple endpoints in clinical trials. Biometrics 43: 487-498.
3. Tang DI, NL Geller (1999) Closed testing procedures for group sequential clinical trials with multiple endpoints. Biometrics 55: 1188-1192.
4. Wassmer G, W Brannath (2016) Multiple testing in adaptive designs. In Group Sequential and Confirmatory Adaptive Designs in Clinical Trials, Springer, pp. 231-239.
5. Urschel HC, LL Hanselka, M Baron (2011) A controlled trial of flumazenil and gabapentin for initial treatment of methylamphetamine dependence. J Psychopharmacology 25(2): 254-262.
6. Urschel HC, LL Hanselka, I Gromov, L White, M Baron (2007) Open-label study of a proprietary treatment program targeting type A γ-aminobutyric acid receptor dysregulation in methamphetamine dependence. Mayo Clinic Proceedings 82(10): 1170-1178.
7. Hendricks AE, J Dupuis, MW Logue, RH Myers, KL Lunetta (2014) Correction for multiple testing in a gene region. Eur J Hum Genet 22(3): 414-418.
8. Babron MC, A Etcheto, MH Dizier (2015) A new correction for multiple testing in gene-gene interaction studies. Annals of Human Genetics 79(5): 380-384.
9. Sul JH, T Raj, S de Jong, PI de Bakker, S Raychaudhuri, et al. (2015) Accurate and fast multiple-testing correction in eQTL studies. The American Journal of Human Genetics 96(6): 857-868.
10. Anderson ML (2012) Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association.
11. Benjamini Y, F Bretz, S Sarkar (2004) Recent Developments in Multiple Comparison Procedures. Beachwood, Ohio: IMS Lecture Notes - Monograph Series.
12. Dmitrienko A, AC Tamhane, F Bretz (2010) Multiple Testing Problems in Pharmaceutical Statistics. Boca Raton, FL: CRC Press.
13. Hsu J (1996) Multiple Comparisons: Theory and Methods. CRC Press.
14. Bretz F, T Hothorn, P Westfall (2016) Multiple Comparisons Using R. CRC Press.
15. Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6(2): 65-70.
16. Benjamini Y, Y Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc 57(1): 289-300.
17. Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75(4): 800-802.
18. Lehmann EL, JP Romano, JP Shaffer (2012) On optimality of step-down and step-up multiple test procedures. In Selected Works of EL Lehmann, Springer, pp. 693-717.
19. Sarkar SK (2002) Some results on false discovery rate in stepwise multiple testing procedures. Ann Stat 30(1): 239-257.
20. Bartroff J, TL Lai (2010) Multistage tests of multiple hypotheses. Communications in Statistics - Theory and Methods 39: 1597-1607.
21. De S, M Baron (2012b) Step-up and step-down methods for testing multiple hypotheses in sequential experiments. J Statist Plann Inference 142: 2059-2070.
22. Bartroff J, J Song (2014) Sequential tests of multiple hypotheses controlling Type I and II familywise error rates. Journal of Statistical Planning and Inference 153: 100-114.
23. De S, M Baron (2015) Sequential tests controlling generalized familywise error rates. Statistical Methodology 23: 88-102.
24. De S, M Baron (2012a) Sequential Bonferroni methods for multiple hypothesis testing with strong control of familywise error rates I and II. Sequential Analysis 31(2): 238-262.
25. Berger JO (1985) Statistical Decision Theory. New York, NY: Springer-Verlag.
26. Jennison C, BW Turnbull (2000) Group Sequential Methods with Applications to Clinical Trials. Boca Raton, FL: Chapman & Hall.
27. Nikitin Y (1995) Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press.
28. Dzhungurova OA, IN Volodin (2007) The asymptotic of the necessary sample size in testing the hypotheses on the shape parameter of a distribution close to the normal one. Russian Mathematics 51(5): 44-50.
29. Bahadur RR (1967) Rates of convergence of estimates and test statistics. The Annals of Mathematical Statistics 38(2): 303-324.
30. Baillie DH (1987) Multivariate acceptance sampling: some applications to defence procurement. The Statistician 36(5): 465-478.
31. Bartroff J (2014) Multiple hypothesis tests controlling generalized error rates for sequential data. arXiv preprint arXiv:1406.5933.
32. Blomquist J (2015) Multiple inference and market integration: An application to Swedish fish markets. Journal of Agricultural Economics 66(1): 221-235.
33. Borovkov AA, AA Mogulskii (2001) Limit theorems in the boundary hitting problem for a multidimensional random walk. Siberian Mathematical Journal 42(2): 245-270.
34. Govindarajulu Z (2004) Sequential Statistics. World Scientific Publishing Co, Singapore.
35. Hamilton DC, ML Lesperance (1991) A consulting problem involving bivariate acceptance sampling by variables. Canadian J Stat 19: 109-117.
36. Landis WG (2003) Twenty years before and hence; ecological risk assessment at multiple scales with multiple stressors and multiple endpoints.
37. Lionberger DR, E Joussellin, A Lanzarotti, J Yanchick, M Magelli (2011) Diclofenac epolamine topical patch relieves pain associated with ankle sprain. J Pain Res 4: 47-53.
38. Park J, CH Jun (2015) A new multivariate EWMA control chart via multiple testing. Journal of Process Control 26: 51-55.
39. Tartakovsky AG, IV Nikiforov, M Basseville (2014) Sequential Analysis: Hypothesis Testing and Change-Point Detection. Chapman & Hall/CRC.
40. Wald A (1947) Sequential Analysis. New York: Wiley.
41. Yang Z, X Zhou, P Zhang (2015) Centralization and innovation performance in an emerging economy: testing the moderating effects. Asia Pacific Journal of Management 32(2): 415-442.

Michael I Baron, Laurel M MacMillan. Weighted Statistics for Testing Multiple Endpoints in Clinical Trials. Annal Biostat & Biomed Appli. 2(2): 2019. ABBA.MS.ID.000532.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.