


Research Article

A sequential test procedure for monitoring a singular safety and efficacy outcome

Alan D. Hutson
University at Buffalo, Department of Statistics, Farber Hall Room 249A, 3435 Main St., Buffalo, NY 14214-3000.

Code Number: pr03011

ABSTRACT

In this note we describe a modification of the sequential probability ratio test (SPRT) developed for the purpose of “flagging” a significant increase in the mortality rate of a treatment relative to a control while ensuring that double-blinding and the Type I error for the primary test of efficacy, also based on mortality rates, are not compromised.

Key words: Interim analysis; safety monitoring; sequential testing.

INTRODUCTION

Drug trials go through different phases. In phase I trials the primary concern is safety, the subjects are typically healthy volunteers or patients, and the primary objective is to determine the maximum tolerated dose (MTD). This is followed by a phase II trial, which builds upon the results of the phase I trial. The primary goal of a phase II trial is to determine the optimal method of administration and examine potential efficacy. If the phase II trial demonstrates that the drug may be reasonably safe and potentially effective, a phase III trial may be carried out. The primary goal of a phase III trial is to compare the effectiveness of the new treatment with that of existing treatments or placebo.

In a majority of phase III clinical trials subject measurements can be divided into distinct efficacy and safety variables. With the exception of truly sequential trials, the primary efficacy variables are usually analyzed at a few well-defined points in time (typically based upon calendar time or accrual milestones).
In contrast, safety is monitored continuously through the generation of adverse event (AE) reports to the respective institutional review board (IRB) and review by the principal investigator (PI), with summary data safety monitoring board (DSMB) reports generated at regular intervals, e.g. quarterly or bi-yearly. In a majority of high-risk clinical trials the DSMB will also monitor the flow of AE reports in real time.

In the specific case of double-blind controlled trials of a new treatment versus standard of care, or placebo, how might one monitor safety “continuously” and efficacy at distinct points in time when the primary safety and efficacy variable is mortality? In addition, if the DSMB wishes to remain blinded to treatment assignment unless there is a true safety issue, what type of information is needed in order for them to make an informed decision in conjunction with real-time monitoring of AE reports? Generating a formal statistical rule to tackle this problem requires the principal investigators to design the trial around the efficacy outcome in the traditional sense, while the DSMB needs to determine which differences between the new therapy and control in the opposite (unsafe) direction are unacceptable.

In this note we develop a one-sided safety monitoring rule based upon a modification of Wald's classical sequential probability ratio test (SPRT)^{1}. This rule is to be used in conjunction with a randomized block design where the hypothesis involving efficacy is a simple comparison of two mortality proportions of the form H_{0}: p_{1} = p_{2} versus H_{1}: p_{1} < p_{2}. The general features of the safety monitoring procedure are as follows:
The motivation for this methodology stemmed from the request of our DSMB to generate a safety monitoring rule based upon statistical methods for the DCAMALA clinical trial of malaria in Ghana^{2}. This trial was designed as a randomized, double-blind, placebo-controlled, single-center trial. The trial was originally designed to enroll n=1500 subjects, but was terminated early for financial reasons. For this specific trial of dichloroacetate (DCA) versus placebo the measure of efficacy was 28-day mortality, and the primary measure of safety was also 28-day mortality.

Ultimately the DSMB wanted a simple statement after each block of subjects completed the trial: “remain blinded at this point in time” or that the “blind be broken at this point in time.” Unblinding the study is used to mean that group summary statistics will be analyzed and presented in an unblinded fashion. This does not necessarily mean that data at the individual subject level are unblinded. Also note that the principal investigators would remain blinded even if the DSMB were to take an unblinded look at this trial. The use of this rule still allowed us to monitor secondary measures of safety in the standard way^{3-4}. In addition, adverse events involving mortality were still monitored on a case-by-case basis (in a blinded manner).

STATISTICAL METHODS

We will employ a version of a sequential test for the purpose of generating a safety monitoring rule, which will basically flag a problem for the DSMB with respect to a disproportionate number of deaths for a new therapy relative to the standard of care or placebo. Note that this rule does not terminate the clinical trial; it only suggests unblinding the trial for the purpose of more intense scrutiny. The sequential test is based upon a modification of the SPRT, initially developed by Wald^{1}, which is a procedure for testing a simple null hypothesis versus a simple alternative hypothesis continuously in time.
With respect to the new safety monitoring plan, “continuously in time” will refer to blocks of subjects who complete the trial as opposed to testing after individual subjects complete the trial. Therefore, in order to implement this rule effectively the design of the trial should be of the form of a randomized block design. Let N=n_{1}+n_{2} denote the total number of subjects for the new treatment plus the standard treatment, K denote the number of blocks, and N_{i}=n_{1i}+n_{2i} denote the number of subjects in block i, i=1, 2, ..., K, where N_{i}=N/K.

For the purposes of our safety monitoring plan we set the null “efficacy” hypothesis to correspond to the original study design and the alternative “safety” hypothesis to correspond to unsafe rates of the new therapy relative to control. Let p_{1} and p_{2} denote the event rates of the new therapy and control group, respectively. Then the safety data monitoring rule (SDMR) consists of testing

H_{0}: θ_{0} = efficacy(p_{1}, p_{2}),
H_{1}: θ_{1} = safety(p_{1}, p_{2}),

after blocks of N_{i}=n_{1i}+n_{2i} subjects complete the study. The DSMB chairperson is then notified of the results after each test is carried out per block.

We strongly recommend that the values for p_{1} and p_{2} corresponding to H_{0} be chosen based upon the original study design pertaining to efficacy. For example, in the DCAMALA trial the mortality rates from which the trial was designed were p_{1}=0.19 and p_{2}=0.25, for DCA and placebo, respectively. For H_{1}, the “safety” hypothesis, the DSMB, with the guidance of a simulation study, deemed p_{1}=0.28 to be an unacceptable death rate in the DCA arm given a placebo death rate of p_{2}=0.25, and accounting for statistical noise. The mathematical details for the efficacy and safety “functions” are contained in Appendix A.
Through simulations we illustrated that if the placebo death rate is lower than anticipated, the decision to recommend unblinding the trial will come earlier given the same relative differences in adverse event rates, e.g. p_{1}=0.18 to p_{2}=0.15 for DCA relative to placebo. Note that the unblinding rule can be easily modified to accommodate other types of outcomes such as mean differences.

Test Statistic.

Denote the estimates of the proportions p_{1} and p_{2} as \hat{p}_{1i} and \hat{p}_{2i}, where i=1, 2, ..., K, and again K denotes the number of blocks. The safety monitoring test statistic λ_{i} is then simply a function of \hat{p}_{1i} and \hat{p}_{2i}, updated after every block i. For the DCAMALA study we set K=75 and n_{1i}=n_{2i}=10 based upon efficacy considerations. Note that n_{1i} and n_{2i} have to be large enough to produce a meaningful value for λ_{i}. The details for the calculation of λ are contained in Appendix A.

If λ > A then stop and reject H_{0} (recommend unblinding the study). If λ ≤ B then stop and accept H_{0} (reset the monitoring rule), else continue to monitor the safety of the study. The consequences of resetting the monitoring rule are examined further in Section. for our purpose. This is demonstrated via the simulation study in Section 3.

The parameters α and β correspond roughly to traditional fixed sample size Type I and Type II error rates. Hence, we will typically choose β to be much smaller than α if the primary goal is safety monitoring. For the DCAMALA trial we determined that the appropriate levels would be α=0.20 and β=10^{-8}, such that it would be unlikely that the test terminates early and we accept H_{0}, yet the test would terminate quickly if there is a safety concern. Setting α=0.20 and β=10^{-8} corresponds to the stopping bounds of A=9.9999 and B=0.00001. Therefore, if the test errs it will err on the conservative side in terms of unblinding the study early.
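The three-way decision described above can be sketched in Python. This is only an illustrative sketch under stated assumptions: the modified statistic of Appendix A is not reproduced in this text, so the code substitutes the classical Wald likelihood ratio for two independent binomial arms, and the names `likelihood_ratio` and `sdmr_decision` are hypothetical. The bounds A and B are passed in as given rather than derived from α and β.

```python
import math


def likelihood_ratio(deaths1, n1, deaths2, n2, h0, h1):
    """Classical SPRT likelihood ratio L(H1) / L(H0) for two binomial arms.

    deaths1, n1: cumulative deaths and subjects in the new-therapy arm.
    deaths2, n2: cumulative deaths and subjects in the control arm.
    h0, h1: (p1, p2) pairs for the efficacy and safety hypotheses.
    Assumption: plain Wald statistic, not the Appendix A modification.
    """
    def log_lik(p1, p2):
        return (deaths1 * math.log(p1) + (n1 - deaths1) * math.log(1 - p1)
                + deaths2 * math.log(p2) + (n2 - deaths2) * math.log(1 - p2))

    return math.exp(log_lik(*h1) - log_lik(*h0))


def sdmr_decision(lam, upper_a, lower_b):
    """Map the statistic to the recommendation reported to the DSMB."""
    if lam > upper_a:
        return "unblind"          # reject H0: recommend unblinding
    if lam <= lower_b:
        return "reset"            # accept H0: reset the monitoring rule
    return "remain blinded"       # keep monitoring


# Hypotheses and bounds quoted in the text for the DCAMALA trial.
H0 = (0.19, 0.25)
H1 = (0.28, 0.25)
lam = likelihood_ratio(deaths1=4, n1=10, deaths2=2, n2=10, h0=H0, h1=H1)
print(sdmr_decision(lam, upper_a=9.9999, lower_b=0.00001))
```

Because the placebo rate is the same under both hypotheses here (p_{2}=0.25), the control-arm terms cancel in this classical form, so the sketch responds only to the deaths in the new-therapy arm.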
The parameters α and β may be adjusted by the DSMB in order to relax or tighten the monitoring rule during the course of the study. Using the candidate choices of α=0.20 and β=10^{-8} (approximately null) in the DCAMALA trial it was determined through simulation that λ_{i} would never be less than B prior to the first planned interim analysis at 50% accrual. If this scenario did occur it would have indicated that DCA is substantially more effective than originally anticipated and that the safety of DCA in terms of the mortality rates should not be of concern to the EAC at that point in time.

Therefore, we propose that if λ_{i} < B at any point in the study, the SPRT decision rule be reset at block i-1, i.e. restart the safety monitoring rule one block back relative to the current point in time (block i) as the new “time 0.” This provides a one-block “burn-in” period to reset the rule. Even after resetting the test statistic due to the outstanding performance of the new therapy, a short run of deaths could occur favoring control such that λ reverses its path and crosses A. This rare event (given a long-term past history of treatment efficacy) would not stop the trial; however, the recommendation to unblind the trial at that point in time would be made to the DSMB Chair. It would then have to be determined whether this run was due to chance alone, or to some deterministic cause such as a bad batch of drug.

Simulation Study

The following simulation study is used to illustrate the proportion of times out of 10,000 simulations that the decision to “remain blinded” or “unblind the study” would occur over possible choices of α=0.05, 0.10, 0.20 and β=10^{-7}, 0.05, 0.10 for a clinical trial with sample size fixed at n=1500, broken into K=75 blocks. The decision to “reset the trial,” as described above, is built in to the simulation study.
The numbers are similar to the operating characteristics of the DCAMALA trial; however, there are no planned interim analyses. In addition, the median time that λ crosses A or B is given. For any specific trial of interest a similar simulation study should be undertaken in order to determine the appropriate parameter values for α and β.

In this specific simulation study the efficacy hypothesis θ_{0}(p_{1},p_{2}) was fixed at θ_{0}(0.20, 0.25) as determined by a hypothetical trial design. Assume that the DSMB decided the safety hypothesis should be θ_{1}(0.30, 0.25). The simulations were then carried out given different scenarios of “true” mortality rates (p_{1},p_{2}). The pairs (p_{1},p_{2}) were set to (0.15,0.25), (0.20,0.25), (0.25,0.25), (0.30,0.25), (0.20,0.15), and (0.15,0.10), corresponding to the new treatment being more efficacious than planned, the new treatment being efficacious as planned, the new treatment being equivalent to placebo, the new treatment being worse than placebo at the correct placebo rate, the new treatment being worse than placebo at a lower placebo mortality rate, and the new treatment being worse than placebo at a very low and unexpected placebo mortality rate, respectively.

The simulation results are provided in Tables 1 through 9. Since we are primarily interested in testing for safety, the choice of α and β comes down to a trade-off between stopping times and Type I error for this application. The column labeled “Median Sample Size” indicates the median time at which a decision rule is to be implemented if the true underlying mortality rates were p_{1} and p_{2}. If we were confident of the underlying truth with respect to p_{1} and p_{2} in terms of efficacy then it becomes a question of trade-offs: what safety error rate is the DSMB willing to live with?
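A simulation of this kind can be sketched as follows. This is a rough sketch under several assumptions and is not expected to reproduce the published tables: it uses the classical Wald log-likelihood ratio in place of the Appendix A statistic, it takes the bounds A and B as inputs, and it adopts one reading of the reset rule in which the current block becomes the first block of the restarted sequence. The names `simulate_trial` and `unblinding_rate` are hypothetical.

```python
import math
import random


def simulate_trial(p1, p2, h0, h1, upper_a, lower_b,
                   n_blocks=75, per_arm=10, rng=None):
    """Simulate one blocked trial under true death rates (p1, p2).

    Returns ("unblind", subjects_completed) at the first crossing of A,
    otherwise ("remain blinded", total_subjects).
    """
    rng = rng or random.Random()
    # Log likelihood-ratio contribution of one death / one survivor per arm.
    inc = [(math.log(h1[arm] / h0[arm]),
            math.log((1 - h1[arm]) / (1 - h0[arm]))) for arm in (0, 1)]
    log_a, log_b = math.log(upper_a), math.log(lower_b)
    log_lam = 0.0
    for block in range(1, n_blocks + 1):
        block_inc = 0.0
        for arm, p in enumerate((p1, p2)):
            deaths = sum(rng.random() < p for _ in range(per_arm))
            block_inc += deaths * inc[arm][0] + (per_arm - deaths) * inc[arm][1]
        log_lam += block_inc
        if log_lam > log_a:
            return "unblind", block * 2 * per_arm
        if log_lam <= log_b:
            # Assumed reset: restart with block i as the first block
            # after the new "time 0" at block i-1.
            log_lam = block_inc
    return "remain blinded", n_blocks * 2 * per_arm


def unblinding_rate(p1, p2, n_sims=10000, seed=0, **kwargs):
    """Proportion of simulated trials that end with "unblind"."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(p1, p2, rng=rng, **kwargs)[0] == "unblind"
               for _ in range(n_sims))
    return hits / n_sims


# Scenario from the text: truth equals the safety hypothesis (0.30, 0.25).
rate = unblinding_rate(0.30, 0.25, n_sims=1000,
                       h0=(0.20, 0.25), h1=(0.30, 0.25),
                       upper_a=9.9999, lower_b=0.00001)
print(f"estimated unblinding rate: {rate:.3f}")
```

Working on the log scale avoids numerical underflow over 75 blocks, and fixing the seed makes a run repeatable when comparing choices of the bounds.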
For example, from Table 1, if the DSMB chose α=0.05 and β=10^{-7} then we would indicate to the DSMB that they would likely “unblind” the trial early 6.3% of the time over theoretical repetitions of the study, and “unblind” the trial early 97.6% of the time if there was a true safety concern. Note that if the new treatment was more successful than anticipated the “unblinding rate” goes down to 0.5%. If the DSMB wanted a very stringent rule then they might go with α=0.2 and β=10^{-7}, e.g. see Table 3. A possible trade-off would be to choose α=0.1 and β=0.1 in Table 8, where we would unblind the study at a rate of 12.3% if we were close to the proportions around which the study was planned, and unblind the study 98.9% of the time if we were close to the safety hypothesis.

DCAMALA Trial Example

The following text appeared (modified slightly for this paper) in each DCAMALA DSMB report following the adoption of the sequential test, called the safety decision monitoring rule (SDMR), in the DCAMALA trial.

“After every 20 subjects (one randomization blocking unit) have completed the study the biostatistics coordinating center will ‘update’ the SDMR and recommend to the DSMB to either remain blinded to the treatment assignment or to unblind the treatment assignment due to a significant increase in the mortality rate, beyond random noise, of the DCA group relative to the placebo group. In addition to the SDMR, mortality data will always be monitored on a case-by-case basis, i.e. if a sequential series of anomalous deaths occurs in any given block the DSMB will be notified immediately, regardless of the SDMR.”

The original trial was designed to enroll n=1500 subjects if accrual through two interim analyses reached 100%. Unfortunately, the trial was terminated early after only n=123 subjects had completed the study due to problems stemming from failing to meet accrual milestones set by the sponsors.
The low accrual rates were directly related to unusual dry spells occurring during the rainy seasons when malaria is prevalent. However, enough data were gathered to demonstrate how the SDMR works in reality.

In order to illustrate the new SDMR to the DCAMALA DSMB prior to their approval or disapproval, the following examples were presented to the committee members. The method was illustrated using various “made-up” outcomes, which were similar to what we anticipated might possibly occur during the DCAMALA trial given

H_{0}: θ_{0} = efficacy(p_{1}=0.19, p_{2}=0.25),
H_{1}: θ_{1} = safety(p_{1}=0.28, p_{2}=0.25).

The results are presented in Tables 10 through 14. After every N_{i}=10+10 subjects completed the study λ_{i} was calculated along with the corresponding recommendation: “Remain Blinded” or “Unblind the Study.” The example given in Table 14 is the only case where DCA mortality was consistently lower than placebo mortality and hence the decision was always to “Remain Blinded.” In all other examples the decision to unblind the study was a function of the true underlying mortality rates. In our opinion these simple examples provided to the committee helped illustrate the utility of the SDMR, and the committee ultimately endorsed its implementation.

Results for the DCAMALA Trial

In this section we illustrate how the SPRT decision rule worked within the context of the DCAMALA trial through 120 subjects given α=0.20 and β=10^{-8}, corresponding to boundaries of A=9.9999999 and B=0.0000111. The trial was terminated due to financial circumstances after n=123 subjects were enrolled. Hence, the final 3 subjects' data were not included in the safety monitoring statistic illustrated here. Let \hat{p}_{1i} and \hat{p}_{2i} denote the estimated mortality rates in the DCA and placebo treatment groups for each block of 20 subjects.
As can be seen, the value of λ_{i} started to “drift” toward A=9.9999999 as the imbalance in mortality rates favored DCA and then started to “drift” back towards B=0.0000111 as the mortality rates became more balanced. The simplicity of programming this method is illustrated via the SAS program used to carry out the calculations given in Appendix B.

CONCLUSION

In this note we presented a statistical decision rule for data safety monitoring purposes when the primary efficacy and primary safety endpoint of a clinical trial is mortality. This rule was designed for ease of interpretation by DSMB members with little or no formal statistical training. The goal of this method is to provide a means of approximately controlling the Type I error for the efficacy analysis, while monitoring safety in a continuous fashion. As was discussed above, the method may be modified to accommodate more complex designs. Future work will involve studying the probability theory behind the utilization of different sequential bounds for efficacy and safety such that this information can be incorporated into the sample size parameters during the design phase of the trial.

ACKNOWLEDGEMENTS

Special thanks to David Harrington for suggesting the development of many of the ideas included in this manuscript and to Sandy Zientek for help with the manuscript preparation. Dr. Hutson’s work is partially supported by a NYSTAR Faculty Development Grant.

REFERENCES
Copyright © 2002-2004. TJPR Faculty of Pharmacy, University of Benin, Benin City, Nigeria
