Tropical Journal of Pharmaceutical Research
Pharmacotherapy Group, Faculty of Pharmacy, University of Benin, Benin City, Nigeria
ISSN: 1596-5996 EISSN: 1596-9827
Vol. 2, No. 2, December 2003, pp. 197-206

Research Article

A sequential test procedure for monitoring a singular safety and efficacy outcome

Alan D. Hutson

University at Buffalo, Department of Statistics, Farber Hall Room 249A, 3435 Main St., Buffalo, NY 14214-3000.
To whom correspondence should be addressed: E-mail:

Code Number: pr03011

Abstract


In this note we describe a modification of the sequential probability ratio test (SPRT) developed to “flag” a significant increase in the mortality rate of a treatment relative to a control, while ensuring that neither the double-blinding nor the Type I error of the primary test of efficacy, which is also based on mortality rates, is compromised.

Key words: Interim analysis; safety monitoring; sequential testing.


Drug trials proceed through distinct phases. In phase I trials the primary concern is safety, the subjects are typically healthy volunteers or patients, and the primary objective is to determine the maximum tolerated dose (MTD). This is followed by a phase II trial, which builds upon the results of the phase I trial. The primary goal of a phase II trial is to determine the optimal method of administration and to examine potential efficacy. If the phase II trial demonstrates that the drug may be reasonably safe and potentially effective, a phase III trial may be carried out. The primary goal of a phase III trial is to compare the effectiveness of the new treatment with that of existing treatments or placebo.

In a majority of phase III clinical trials, subject measurements can be divided into distinct efficacy and safety variables. With the exception of truly sequential trials, the primary efficacy variables are usually analyzed at a few well-defined points in time (typically based upon calendar time or accrual milestones). In contrast, safety is monitored continuously through the generation of adverse event (AE) reports to the respective institutional review board (IRB) and review by the principal investigator (PI), with summary data safety monitoring board (DSMB) reports generated at regular intervals, e.g. quarterly or bi-yearly. In a majority of high-risk clinical trials the DSMB will also monitor the flow of AE reports in real time.

In the specific case of double-blind controlled trials of a new treatment versus standard of care or placebo, how might one monitor safety “continuously” and efficacy at distinct points in time when the primary safety and efficacy variable is mortality? In addition, if the DSMB wishes to remain blinded to treatment assignment unless there is a true safety issue, what type of information do its members need in order to make an informed decision in conjunction with real-time monitoring of AE reports? Generating a formal statistical rule to tackle this problem requires the principal investigators to design the trial around the efficacy outcome in the traditional sense, while the DSMB needs to determine which differences between the new therapy and control in the opposite (unsafe) direction are unacceptable.

In this note we develop a one-sided safety monitoring rule based upon a modification of Wald's classical sequential probability ratio test (SPRT) [1]. This rule is to be used in conjunction with a randomized block design where the hypothesis involving efficacy is a simple comparison of two mortality proportions of the form H0: p1 = p2 versus H1: p1 < p2. The general features of the safety monitoring procedure are as follows:

  1. It assumes that mortality can be evaluated within a fixed time period shortly following treatment,
  2. It is one-sided in the sense that a safety problem is not flagged if the new treatment appears efficacious,
  3. It requires the input of members of the DSMB to determine what an unsafe rate might be for the new therapy relative to control, given the design parameters of the trial,
  4. It is continuously sequential, with blocks of subjects serving as the unit of time,
  5. It is easily explained to non-statisticians through the use of examples,
  6. It maintains Type I error control for the test of efficacy, conditional upon the relative safety of the experimental therapy.

The motivation for this methodology stemmed from the request of our DSMB to generate a safety monitoring rule based upon statistical methods for the DCA-MALA clinical trial of malaria in Ghana [2]. This trial was designed as a randomized, double-blind, placebo-controlled, single-center trial. The trial was originally designed to enroll n = 1500 subjects, but was terminated early for financial reasons. For this specific trial of dichloroacetate (DCA) versus placebo the measure of efficacy was 28-day mortality, and the primary measure of safety was also 28-day mortality. Ultimately the DSMB wanted a simple statement after each block of subjects completed the trial: “remain blinded at this point in time” or that the “blind be broken at this point in time.” Here, “unblinding the study” means that group summary statistics will be analyzed and presented in an unblinded fashion. This does not necessarily mean that data at the individual subject level are unblinded. Also note that the principal investigators would remain blinded even if the DSMB were to take an unblinded look at this trial. The use of this rule still allowed us to monitor secondary measures of safety in the standard way [3, 4]. In addition, adverse events involving mortality were still monitored on a case-by-case basis (in a blinded manner).


We employ a version of a sequential test to generate a safety monitoring rule, which flags a problem for the DSMB when a disproportionate number of deaths occurs on a new therapy relative to the standard of care or placebo. Note that this rule does not terminate the clinical trial; it only suggests unblinding the trial for the purpose of more intense scrutiny. The sequential test is based upon a modification of the SPRT, initially developed by Wald [1], which is a procedure for testing a simple null hypothesis versus a simple alternative hypothesis continuously in time. With respect to the new safety monitoring plan, “continuously in time” refers to blocks of subjects who complete the trial, as opposed to testing after each individual subject completes the trial. Therefore, in order to implement this rule effectively the trial should follow a randomized block design. Let N = n1 + n2 denote the total number of subjects for the new treatment plus the standard treatment, K denote the number of blocks, and Ni = n1i + n2i denote the number of subjects in block i, i = 1, 2, ..., K, where Ni = N/K.
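As a minimal sketch of the block structure just described, the Python snippet below builds a permuted-block randomization schedule in which every block holds the same number of subjects per arm. The function name `permuted_blocks`, the arm labels, and the seed are illustrative assumptions, not part of the original trial software; the values K = 75 and 10 subjects per arm are the DCA-MALA design values quoted later in the paper.

```python
import random


def permuted_blocks(n_blocks, n_per_arm, seed=None):
    """Return a list of blocks; each block is a shuffled list of arm labels
    with exactly n_per_arm 'new' and n_per_arm 'control' assignments."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = ["new"] * n_per_arm + ["control"] * n_per_arm
        rng.shuffle(block)  # random order within the block, balanced by construction
        blocks.append(block)
    return blocks


# K = 75 blocks of N_i = 10 + 10 subjects gives N = 1500 subjects in total.
schedule = permuted_blocks(n_blocks=75, n_per_arm=10, seed=1)
total = sum(len(block) for block in schedule)
print(total)  # 1500
```

Because each block is balanced, the monitoring statistic can be updated after every completed block with equal numbers of subjects in the two arms.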

For the purposes of our safety monitoring plan we set the null “efficacy” hypothesis to correspond to the original study design and the alternative “safety” hypothesis to correspond to unsafe rates of the new therapy relative to control. Let p1 and p2 denote the event rates of the new therapy and control group, respectively. Then the safety data monitoring rule (SDMR) consists of testing

H0: θ0 = efficacy(p1, p2),

H1: θ1 = safety(p1, p2),

after blocks of Ni = n1i + n2i subjects complete the study. The DSMB chairperson is notified of the result after the test is carried out for each block. We strongly recommend that the values of p1 and p2 corresponding to H0 be chosen based upon the original study design pertaining to efficacy. For example, in the DCA-MALA trial the mortality rates around which the trial was designed were p1 = 0.19 and p2 = 0.25 for DCA and placebo, respectively. For H1, the “safety” hypothesis, the DSMB, with the guidance of a simulation study, deemed p1 = 0.28 to be an unacceptable death rate in the DCA arm given a placebo death rate of p2 = 0.25, after accounting for statistical noise. The mathematical details of the efficacy and safety “functions” are contained in Appendix A. Through simulations we illustrated that if the placebo death rate is lower than anticipated, the recommendation to unblind the trial will come earlier given the same relative difference in adverse event rates, e.g. p1 = 0.18 versus p2 = 0.15 for DCA relative to placebo. Note that the unblinding rule can easily be modified to accommodate other types of outcomes, such as mean differences.

Test Statistic. Denote the estimates of the proportions p1 and p2 at block i as p̂1i and p̂2i, where i = 1, 2, ..., K, and again K denotes the number of blocks. The safety monitoring test statistic λi is then simply a function of p̂1i and p̂2i and is updated after every block i. For the DCA-MALA study we set K = 75 and n1i = n2i = 10 based upon efficacy considerations. Note that n1i and n2i have to be large enough to produce a meaningful value for λi. The details of the calculation of λ are contained in Appendix A. If λ > A then stop and reject H0 (recommend unblinding the study). If λ ≤ B then stop and accept H0 (reset the monitoring rule); otherwise continue to monitor the safety of the study. The consequences of resetting the monitoring rule are examined further below.
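The exact form of λ is given in Appendix A, which is not reproduced here. To illustrate the mechanics only, the sketch below uses the classical (unmodified) Wald likelihood ratio for two independent binomial arms under simple hypotheses, which is an assumption and may differ from the modified statistic used in the trial. The function names `block_log_lr` and `decide` and the per-block death counts are hypothetical; the hypothesized rates and the bounds A and B are the values quoted in the text.

```python
import math


def block_log_lr(x1, n1, x2, n2, h0, h1):
    """Log-likelihood ratio (H1 versus H0) contributed by one block.

    x1, x2: deaths in the new-therapy and control arms of the block.
    n1, n2: subjects in each arm of the block.
    h0, h1: (p1, p2) pairs under the efficacy and safety hypotheses.
    """
    def arm(x, n, p_alt, p_null):
        return (x * math.log(p_alt / p_null)
                + (n - x) * math.log((1 - p_alt) / (1 - p_null)))
    return arm(x1, n1, h1[0], h0[0]) + arm(x2, n2, h1[1], h0[1])


def decide(lam, A, B):
    """Apply the decision rule stated in the text to the current lambda."""
    if lam > A:
        return "unblind"
    if lam <= B:
        return "reset"
    return "remain blinded"


h0 = (0.19, 0.25)        # efficacy hypothesis (p1, p2) quoted in the text
h1 = (0.28, 0.25)        # safety hypothesis (p1, p2) quoted in the text
A, B = 9.9999, 0.00001   # stopping bounds as quoted in the text

# Hypothetical deaths per block of 10 + 10 subjects: (new therapy, control).
blocks = [(3, 2), (4, 3), (5, 2)]
log_lam = 0.0
for i, (x1, x2) in enumerate(blocks, start=1):
    log_lam += block_log_lr(x1, 10, x2, 10, h0, h1)
    lam = math.exp(log_lam)
    print(i, round(lam, 3), decide(lam, A, B))
```

In this classical form, when p2 is identical under both hypotheses the control-arm terms cancel, so the statistic is driven entirely by deaths in the new-therapy arm. Working on the log scale avoids numerical underflow when many blocks are accumulated.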

[...] for our purpose. This is demonstrated via the simulation study in Section 3. The parameters α and β correspond roughly to the traditional fixed-sample-size Type I and Type II error rates. Hence, we will typically choose β to be much smaller than α if the primary goal is safety monitoring. For the DCA-MALA trial we determined that appropriate levels would be α = 0.20 and β = 10^-8, such that it would be unlikely that the test terminates early and we accept H0, yet the test would terminate quickly if there were a safety concern. Setting α = 0.20 and β = 10^-8 corresponds to stopping bounds of A = 9.9999 and B = 0.00001. Therefore, if the test errs it will err on the conservative side, in terms of unblinding the study early. The parameters α and β may be adjusted by the DSMB in order to relax or tighten the monitoring rule during the course of the study.
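For orientation, Wald's classical approximations to the SPRT stopping bounds in terms of the nominal error rates α and β are shown below. This is the textbook form and is offered as an assumption about the starting point only; the modified rule in this paper may calibrate A and B differently, so these expressions need not reproduce the bounds quoted above.

```latex
A \approx \frac{1-\beta}{\alpha}, \qquad
B \approx \frac{\beta}{1-\alpha}, \qquad
\text{continue monitoring while } B < \lambda_i \le A .
```

For example, α = 0.05 and β = 0.05 give A ≈ 19 and B ≈ 0.0526. As β shrinks toward zero, B shrinks toward zero as well, so early acceptance of H0 becomes very unlikely, which is consistent with the stated reason for choosing a very small β.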

Using the candidate choices of α = 0.20 and β = 10^-8 (approximately null) in the DCA-MALA trial, it was determined through simulation that λi would never be less than B prior to the first planned interim analysis at 50% accrual. If this scenario did occur, it would have indicated that DCA is substantially more effective than originally anticipated and that the safety of DCA in terms of mortality rates should not be of concern to the EAC at that point in time. Therefore, we propose that if λi < B at any point in the study, the SPRT decision rule be reset at block i-1, i.e. the safety monitoring rule is restarted one block back relative to the current point in time (block i), which becomes the new “time 0.” This provides a one-block “burn-in” period to reset the rule. Even after the test statistic has been reset because of the outstanding performance of the new therapy, a short run of deaths favoring control could occur such that λ reverses its path and crosses A. This rare event (given a long-term history of treatment efficacy) would not stop the trial; however, the recommendation to unblind the trial at that point would be made to the DSMB Chair. It would then have to be determined whether this run was due to chance alone or to some deterministic cause such as a bad batch of drug.
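The reset logic can be sketched as follows. This is one possible reading of the rule, under two assumptions that the text does not confirm: that λ accumulates multiplicatively from per-block likelihood-ratio factors, and that restarting with block i-1 as the new “time 0” means only block i contributes to the restarted statistic. The function name `monitor_with_reset` and the numeric factors are hypothetical.

```python
def monitor_with_reset(block_factors, A, B):
    """Track lambda over blocks and return one decision string per block.

    block_factors: per-block multiplicative likelihood-ratio factors.
    A, B: upper (unblind) and lower (reset) stopping bounds, with 0 < B < A.
    """
    decisions = []
    lam = 1.0
    for factor in block_factors:
        lam *= factor
        if lam > A:
            decisions.append("unblind")
            break  # the recommendation goes to the DSMB; monitoring stops here
        if lam <= B:
            # Restart with block i-1 as the new time 0: only block i counts.
            lam = factor
            decisions.append("reset")
        else:
            decisions.append("remain blinded")
    return decisions


# Hypothetical factors: two favourable blocks, then a run of unfavourable ones.
print(monitor_with_reset([0.5, 0.5, 3.0, 4.0], A=5.0, B=0.3))
# ['remain blinded', 'reset', 'remain blinded', 'unblind']
```

Without the reset, the running product in this example would end at 3.0 and never cross A; the reset discards the earlier favourable history, so a later run of deaths can still trigger the unblinding recommendation.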

Simulation Study

The following simulation study illustrates the proportion of times out of 10,000 simulations that the decision to “remain blinded” or “unblind the study” would occur over the possible choices α = 0.05, 0.10, 0.20 and β = 10^-7, 0.05, 0.10 for a clinical trial with sample size fixed at n = 1500, broken into K = 75 blocks. The decision to “reset the trial,” as described above, is built into the simulation study. The numbers are similar to the operating characteristics of the DCA-MALA trial; however, there are no planned interim analyses. In addition, the median time at which λ crosses A or B is given. For any specific trial of interest a similar simulation study should be undertaken in order to determine appropriate parameter values for α and β.

In this specific simulation study the efficacy hypothesis θ0(p1,p2) was fixed at θ0(0.20, 0.25) as determined by a hypothetical trial design. Assume that the DSMB decided the safety hypothesis should be θ1(0.30, 0.25). The simulations were then carried out given different scenarios of “true” mortality rates (p1,p2). The pairs (p1,p2) were set to (0.15,0.25), (0.20,0.25), (0.25,0.25), (0.30,0.25), (0.20,0.15), and (0.15,0.10), corresponding to the new treatment being more efficacious than planned, the new treatment being efficacious as planned, the new treatment being equivalent to placebo, the new treatment being worse than placebo at the correct placebo rate, the new treatment being worse than placebo at a lower placebo mortality rate, and the new treatment being worse than placebo at a very low and unexpected placebo mortality rate, respectively.
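A simulation of this kind might be sketched as below. It is not the simulation used for the tables: it assumes the classical Wald likelihood ratio and the classical bounds A = (1-β)/α and B = β/(1-α) rather than the modified statistic of Appendix A, and it uses the same reading of the reset rule as an assumption. All function names are hypothetical, and the estimated rates are not expected to match the published tables.

```python
import math
import random


def simulate_trial(p_true, h0, h1, A, B, n_blocks=75, n_per_arm=10, rng=random):
    """Return the block at which unblinding is recommended, or None."""
    log_a, log_b = math.log(A), math.log(B)
    log_lam = 0.0
    for block in range(1, n_blocks + 1):
        inc = 0.0
        for arm in (0, 1):  # 0 = new therapy, 1 = control
            deaths = sum(rng.random() < p_true[arm] for _ in range(n_per_arm))
            inc += deaths * math.log(h1[arm] / h0[arm])
            inc += (n_per_arm - deaths) * math.log((1 - h1[arm]) / (1 - h0[arm]))
        log_lam += inc
        if log_lam > log_a:
            return block
        if log_lam <= log_b:
            log_lam = inc  # reset, counting only the current block
    return None


def unblinding_rate(p_true, n_sims=500, seed=0, **kwargs):
    """Estimate the proportion of simulated trials that recommend unblinding."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(p_true, rng=rng, **kwargs) is not None
               for _ in range(n_sims))
    return hits / n_sims


alpha, beta = 0.05, 0.05
settings = dict(h0=(0.20, 0.25), h1=(0.30, 0.25),
                A=(1 - beta) / alpha, B=beta / (1 - alpha))
for p_true in [(0.15, 0.25), (0.20, 0.25), (0.25, 0.25), (0.30, 0.25)]:
    print(p_true, unblinding_rate(p_true, **settings))
```

Because p2 is the same under both hypotheses, the control arm contributes nothing to this classical statistic, so scenarios that differ only in the placebo rate are not distinguished by this sketch.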

The simulation results are provided in Tables 1 through 9. Since we are primarily interested in testing for safety, the choice of α and β comes down to a tradeoff between stopping times and Type I error for this application. The column labeled “Median Sample Size” indicates the median time at which a decision rule would be implemented if the true underlying mortality rates were p1 and p2. If we were confident of the underlying truth with respect to p1 and p2 in terms of efficacy, then it becomes a question of tradeoffs.

What safety error rate is the DSMB willing to live with? For example, from Table 1, if the DSMB chose α = 0.05 and β = 10^-7 then we would indicate to the DSMB that they would likely “unblind” the trial early 6.3% of the time over theoretical repetitions of the study, and “unblind” the trial early 97.6% of the time if there were a true safety concern. Note that if the new treatment were more successful than anticipated the “unblinding rate” would go down to 0.5%. If the DSMB wanted a very stringent rule then they might go with α = 0.2 and β = 10^-7, e.g. see Table 3. A possible tradeoff would be to choose α = 0.1 and β = 0.1 in Table 8, where we would unblind the study at a rate of 12.3% if we were close to the proportions around which the study was planned, and unblind the study 98.9% of the time if we were close to the safety hypothesis.

DCA-MALA Trial Example

The following text appeared (modified slightly for this paper) in each DCA-MALA DSMB report following the adoption of the sequential test, called the safety decision monitoring rule (SDMR), in the DCA-MALA trial. “After every 20 subjects (one randomization blocking unit) have completed the study, the biostatistics coordinating center will ‘update’ the SDMR and recommend to the DSMB either to remain blinded to the treatment assignment or to unblind the treatment assignment due to a significant increase in the mortality rate, beyond random noise, of the DCA group relative to the placebo group. In addition to the SDMR, mortality data will always be monitored on a case-by-case basis, i.e. if a sequential series of anomalous deaths occurs in any given block the DSMB will be notified immediately, regardless of the SDMR.”

The original trial was designed to enroll n = 1500 subjects if accrual through two interim analyses reached 100%. Unfortunately, the trial was terminated early, after only n = 123 subjects had completed the study, because accrual milestones set by the sponsors were not met. The low accrual rates were directly related to unusual dry spells during the rainy seasons, when malaria is prevalent. However, enough data were gathered to demonstrate how the SDMR works in practice.

In order to illustrate the new SDMR to the DCA-MALA DSMB prior to their approval or disapproval, the following examples were presented to the committee members. The method was illustrated using various “made-up” outcomes, similar to what we anticipated might occur during the DCA-MALA trial, given

H0: θ0 = efficacy(p1 = 0.19, p2 = 0.25),

H1: θ1 = safety(p1 = 0.28, p2 = 0.25).

The results are presented in Tables 10 through 14. After every Ni = 10 + 10 subjects completed the study, λi was calculated along with the corresponding recommendation: “Remain Blinded” or “Unblind the Study.” The example given in Table 14 is the only case where DCA mortality was consistently lower than placebo mortality, and hence the decision was always to “Remain Blinded.” In all other examples the decision to unblind the study was a function of the true underlying mortality rates. In our opinion these simple examples helped illustrate the utility of the SDMR to the committee, which ultimately endorsed its implementation.

Table 15

Results for the DCA-MALA Trial

In this section we illustrate how the SPRT decision rule worked within the context of the DCA-MALA trial through 120 subjects, given α = 0.20 and β = 10^-8, corresponding to boundaries of A = 9.9999999 and B = 0.0000111. The trial was terminated due to financial circumstances after n = 123 subjects were enrolled. Hence, the final three subjects' data were not included in the safety monitoring statistic illustrated here.

Let p̂1i and p̂2i denote the estimated mortality rates in the DCA and placebo treatment groups for each block of 20 subjects. As can be seen, the value of λi started to “drift” toward A = 9.9999999 as the imbalance in mortality rates favored DCA and then started to “drift” back toward B = 0.0000111 as the mortality rates became more balanced. The simplicity of programming this method is illustrated by the SAS program used to carry out the calculations, given in Appendix B.


In this note we presented a statistical decision rule for data safety monitoring purposes when the primary efficacy endpoint and the primary safety endpoint of a clinical trial are both mortality. This rule was designed for ease of interpretation by DSMB members with little or no formal statistical training. The goal of this method is to provide a means of approximately controlling the Type I error for the efficacy analysis while monitoring safety in a continuous fashion. As was discussed above, the method may be modified to accommodate more complex designs. Future work will involve studying the probability theory behind the use of different sequential bounds for efficacy and safety, such that this information can be incorporated into the sample size parameters during the design phase of the trial.

Appendix 1, 2

Acknowledgements


Special thanks to David Harrington for suggesting the development of many of the ideas included in this manuscript and to Sandy Zientek for assistance with the manuscript preparation. Dr. Hutson’s work is partially supported by a NYSTAR Faculty Development Grant.

References

  1. Wald, A. Sequential Analysis. John Wiley & Sons Inc., New York, 1947.
  2. Krishna, S., Nagaraja, R., Planche, T., Agbenyega, T., Bedo-Addo, G., Ansong, D., Owusa-Oforia, A., Shroads, A. L., Henderson, G., Hutson, A., Derendorf, H., Stacpoole, S. Population pharmacokinetics of intramuscular quinine in children with severe malaria. J Clin Endocrinol Metabol 2000; 85: 1569-76.
  3. Expert Working Group (Efficacy) of the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). Guideline for Industry, Structure and Content of Clinical Study Reports, July 1996.
  4. U.S. Department of Health and Human Services. Food and Drug Administration. International Conference on Harmonization; Guidance on Statistical Principles in Clinical Trials; Availability. Federal Register 1998; 63:49583-49598.

Copyright @2002-2004. TJPR Faculty of Pharmacy, University of Benin, Benin City, Nigeria
