Indian Journal of Dermatology, Venereology, Leprology, Vol. 70, No. 2, March-April, 2004, pp. 123-128

Research Methodology

Sample size and power analysis in medical research

Zodpey Sanjay P
Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra

ABSTRACT

Among the questions that a researcher should ask when planning a study is "How large a sample do I need?" If the sample size is too small, even a well conducted study may fail to answer its research question, may fail to detect important effects or associations, or may estimate those effects or associations too imprecisely. Similarly, if the sample size is too large, the study will be more difficult and costly, and may even lead to a loss in accuracy. Hence, an optimum sample size is an essential component of any research. When the estimated sample size cannot be included in a study, post-hoc power analysis should be carried out. Approaches for estimating sample size and performing power analysis depend primarily on the study design and the main outcome measure of the study. There are distinct approaches for calculating sample size for different study designs and different outcome measures. In addition, there are different procedures for calculating sample size for the two approaches of drawing statistical inference from study results, i.e. the confidence interval approach and the test of significance approach. This article describes some commonly used terms which need to be specified for a formal sample size calculation. Examples of four procedures conventionally used for calculating sample size (use of formulae, readymade tables, nomograms, and computer software) are also given.

INTRODUCTION

Medical researchers primarily consult bio-statisticians for two reasons. Firstly, they want to know how many subjects should be included in their study (sample size) and how these subjects should be selected (sampling methods).
Secondly, they want to attach a p value to their results in order to claim statistical significance. These two bio-statistical issues are interrelated. If a study does not have an optimum sample size, true differences that exist in reality may not be detected as significant; that is, the study lacks the power to detect them because of its inadequate sample size.[1] However outstanding the results a study produces, their validity will be questioned if the sample size is inadequate. If the sample size is too small (less than the optimum), even the most rigorously executed study may fail to answer its research question, may fail to detect important effects or associations, or may estimate those effects or associations too imprecisely. Similarly, if the sample size is too large (more than the optimum), the study will be more difficult and costly, and may even lead to a loss in accuracy, as it is often difficult to maintain high data quality. Hence, it is necessary to estimate the optimum sample size for each individual study.[1] For these reasons, medical literature in recent years has focused increasing attention on sample size requirements in medical research,[2] and peer-reviewed journals scrutinize the appropriateness of the sample size during manuscript review.

The issue of sample size can be addressed at two stages of the conduct of a study. Firstly, the optimum sample size can be calculated at the planning stage, while designing the study, using appropriate approaches and information on certain parameters. Secondly, the issue can be addressed through post-hoc power analysis at the stage of interpreting the results. In practice, the size of a study is often restricted by limited financial resources, the availability of cases (rare diseases) and time constraints.
In these situations the researcher completes the study using the available subjects and performs a post-hoc power analysis.[1]

It is also important to note that the approach for estimating the sample size depends primarily on the study design and the main outcome measure of the study. Various study design options are available for conducting medical research, and the researcher needs to select the design appropriate to the research question. The procedure for calculating the sample size differs across designs: it is different for a case-control design than for a cohort design, and there are likewise different approaches for cross-sectional studies, clinical trials, diagnostic test studies, etc. Moreover, within each study design there may be sub-designs, and the sample size calculation varies accordingly. For case-control studies, for instance, the approach is distinct for matched and unmatched designs. Hence, one must use the approach appropriate to the study design and its subtype.[1]

The second important issue to consider while computing the sample size is the primary outcome measure. The primary outcome measure is usually reflected in the primary research question of the study and also depends on the study design. For estimating risk in a case-control study the primary outcome measure would be the odds ratio, whereas in a cohort study it would be the relative risk. In a case-control study, the primary outcome measure could be the difference in means/proportions of exposure between cases and controls, the crude odds ratio, the adjusted odds ratio, the attributable risk, the population attributable risk, the prevented fraction, etc.
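The distinction between the odds ratio (estimated in a case-control study) and the relative risk (estimated in a cohort study) can be made concrete with a small worked example. The 2x2 counts below are hypothetical, chosen only to illustrate the arithmetic, and a minimal Python sketch is used in place of a statistical package:

```python
# Hypothetical 2x2 table relating an exposure to a disease
# (counts are illustrative, not taken from any real study):
#
#                diseased   not diseased
#   exposed        a = 40       b = 60
#   unexposed      c = 20       d = 80

a, b, c, d = 40, 60, 20, 80

# Odds ratio: the measure of association estimated in a case-control study
odds_ratio = (a * d) / (b * c)                      # (40*80)/(60*20) = 2.67

# Relative risk: the measure estimated in a cohort study, where the
# incidence of disease in each exposure group is observed directly
risk_exposed = a / (a + b)                          # 0.40
risk_unexposed = c / (c + d)                        # 0.20
relative_risk = risk_exposed / risk_unexposed       # 2.0

print(round(odds_ratio, 2), relative_risk)
```

Note that the two measures differ on the same counts; the odds ratio approximates the relative risk only when the disease is rare.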
One of these primary outcome measures has to be specified while calculating the sample size, since there is a distinct approach for each.[3] Similarly, for each study design there may be several outcomes, and the researcher needs to specify the main outcome measure of the study.

Two approaches are used for drawing a statistical inference from study results: estimation (the confidence interval approach) and hypothesis testing (the test of significance approach). The procedures for calculating the sample size for these two approaches differ and are available in the literature.[1],[2],[4],[5] A researcher needs to select the appropriate procedure for computing the sample size and subsequently draw statistical inferences using the corresponding approach. One also needs to specify some additional parameters, depending on the approach chosen: the hypothesis (one- or two-tailed), precision, type I error, type II error, power, effect size, design effect, etc. To understand the principles of sample size calculation and power analysis, one should be familiar with these commonly used terms.

Description of some commonly used terms[1]

Random error
Random error describes the role of chance, particularly when the effects of explanatory or predictive factors have already been taken into account. Sources of random error include sampling variability, subject-to-subject differences and measurement errors. It can be controlled and reduced to acceptably low levels by averaging, by increasing the sample size and by repeating the experiment.
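The claim that random error shrinks as the sample size grows can be illustrated with a short simulation; the prevalence value, sample sizes and number of repetitions below are arbitrary choices for the sketch, not figures from the article:

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_PREVALENCE = 0.3  # arbitrary 'population' value for illustration

def estimate_prevalence(n: int) -> float:
    """Estimate the prevalence from one random sample of size n."""
    cases = sum(random.random() < TRUE_PREVALENCE for _ in range(n))
    return cases / n

# Repeat each hypothetical survey 1000 times and measure the spread
# (the standard error) of the resulting estimates.
spread = {}
for n in (25, 100, 400, 1600):
    estimates = [estimate_prevalence(n) for _ in range(1000)]
    spread[n] = statistics.stdev(estimates)
    print(f"n = {n:4d}  standard error of estimate = {spread[n]:.4f}")

# Each four-fold increase in n roughly halves the random error,
# i.e. the standard error scales as 1/sqrt(n).
```

This is the quantitative sense in which a larger sample "controls" random error: the estimate itself is not guaranteed to be closer to the truth on any one occasion, but its sampling variability is smaller.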
Systematic error (Bias)
Systematic error is error that distorts the results in a consistent direction. Unlike random error, it cannot be reduced by increasing the sample size; it must be controlled through study design and conduct.

Precision (Reliability)
Precision is the degree to which repeated measurements or estimates of the same quantity agree with each other. Greater precision corresponds to less random error.

Accuracy (Validity)
Accuracy is the degree to which a measurement or estimate corresponds to the true value of the quantity being measured.

Null hypothesis
The null hypothesis states that there is no difference or no association in the population; it is the hypothesis that the statistical test seeks to reject.

Alternative hypothesis
The alternative hypothesis states that a difference or association does exist in the population; it is accepted when the null hypothesis is rejected.

Type I (α) error
A type I error is committed when a true null hypothesis is rejected, i.e. a difference is declared where none exists. Its probability is denoted by α.

Type II (β) error
A type II error is committed when a false null hypothesis is not rejected, i.e. a real difference is missed. Its probability is denoted by β.

Power (1−β)
Power is the probability of rejecting the null hypothesis when it is false, i.e. the probability that the study will detect a difference or association that truly exists.

Effect size
The effect size is the smallest difference or strength of association that is considered clinically important and that the study is designed to detect.

Design effect
The design effect is the factor by which the sample size must be inflated when a complex sampling scheme (e.g. cluster sampling) is used instead of simple random sampling.

Procedures for calculating the sample size

Use of formulae for sample size calculation and power analysis

To investigate the role of oral contraceptives (OC) in the etiology of cutaneous malignant melanoma in women, an unmatched case-control study is to be undertaken. For calculating the sample size for this study using formulae,[3] the following parameters have to be specified: the exposure rate among controls (p0), the expected exposure rate among cases (p1, fixed by the odds ratio to be detected), the type I error (α) and the desired power (1−β).

Formula:[3]

n = 2p̄q̄(Zα + Zβ)² / (p1 − p0)²

where p̄ = (p0 + p1)/2 and q̄ = 1 − p̄.

Solution (substituting the specified values in the formula): n = 188 in each group.

If we decide to study only 50 cases and 50 controls, then, with the other specifications unchanged, the power of the study would be obtained as follows.

Formula:[3]

Zβ = [√n |p1 − p0| − Zα √(2p̄q̄)] / √(p1q1 + p0q0)

The power is determined from tables of the normal distribution by finding the probability with which the calculated value of Zβ is not exceeded.

Solution (substituting the specified values in the formula): Zβ = −1.13. From tables of the normal probability function, Power = P(Z ≤ −1.13) = 0.13. Thus, if the odds ratio in the target population is 2, a case-control study of n = 50 per group has only a 13% chance of finding the sample estimate significantly (α = 0.05) different from unity.

Use of readymade tables for sample size calculation[1],[2],[3],[4],[5]

Use of nomograms for sample size calculation[6],[7]

The desired percentage change is located on the horizontal axis of the nomogram (line x, [Figure - 1]). A vertical line is extended to intersect the diagonal line corresponding to the response rate in the control group. If the appropriate diagonal line does not extend far enough to intersect this vertical line, one can try using the other treatment group as the control group.
The symmetrical design of the nomogram allows an arbitrary designation of the control group. Finally, a horizontal line is extended from this point to the vertical axis, which shows the sample size required for both the treatment and control groups.

Example[6]

A study randomly allocates patients with an infectious disease to treatment with drug A or drug B. The study reports a 40% cure rate with drug A, the current standard therapy, and a 45% cure rate with drug B, a new drug, and concludes that there is no statistically significant difference in response rates between the two drugs. There are 150 patients in each treatment group.

A researcher reading this study believes that previous studies suggest a better response rate in patients treated with drug B. He decides that a 25% improvement in the usual response rate from drug A, i.e. from 40% to 50%, would be important to him; he does not consider a smaller difference to be clinically important. Using the nomogram, he finds that the sample size required to detect a 25% difference in cure rate between drug A and drug B, assuming a control group cure rate of 40%, is about 370 (line x, [Figure - 1]). This is the sample size that ensures an 80% chance of detecting this difference if it exists, assuming an α of 0.05. Because there are only 150 patients in each treatment group, the sample size is clearly inadequate; it is not large enough to be sure that a clinically important 25% difference in cure rates does not exist. The researcher therefore feels justified in continuing to prescribe drug B, since previous evidence suggests that it is more effective and the new study, despite its negative results, is too small to refute this evidence.
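The nomogram reading in this example can be checked against the two-proportion formulas quoted in the section on formulae. The sketch below evaluates those formulas directly; since the nomogram is itself a graphical approximation, the computed figure (about 389 per group) differs slightly from the roughly 370 read off the chart:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p0: float, p1: float,
                                alpha: float = 0.05, power: float = 0.80) -> int:
    """n per group from n = 2*pbar*qbar*(Z_alpha + Z_beta)^2 / (p1 - p0)^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_b = NormalDist().inv_cdf(power)
    pbar = (p0 + p1) / 2
    qbar = 1 - pbar
    return ceil(2 * pbar * qbar * (z_a + z_b) ** 2 / (p1 - p0) ** 2)

def power_two_proportions(n: int, p0: float, p1: float,
                          alpha: float = 0.05) -> float:
    """Power via Z_beta = [sqrt(n)|p1-p0| - Z_alpha*sqrt(2*pbar*qbar)] / sqrt(p1q1 + p0q0)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    pbar = (p0 + p1) / 2
    qbar = 1 - pbar
    z_b = (sqrt(n) * abs(p1 - p0) - z_a * sqrt(2 * pbar * qbar)) / \
          sqrt(p1 * (1 - p1) + p0 * (1 - p0))
    return NormalDist().cdf(z_b)

# Sample size needed to detect an improvement from a 40% to a 50% cure rate
print(sample_size_two_proportions(0.40, 0.50))            # 389 per group

# Power the reported study (150 per group) had to detect that difference
print(round(power_two_proportions(150, 0.40, 0.50), 2))   # about 0.41
```

With only about a 41% chance of detecting the clinically important 40% to 50% difference, the "negative" result of the 150-per-group study is indeed inconclusive, which is the point the nomogram makes graphically.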
A separate nomogram is available for continuous variables.[6] Both nomograms are intended to provide the clinician with a handy, easy-to-use reference for ascertaining whether an apparently negative study had a sample size adequate to reliably detect any important difference between treatment groups.

Use of computer software for sample size calculation and power analysis

REFERENCES
Copyright 2004 - Indian Journal of Dermatology, Venereology, Leprology