What Power Should Be Used for Power Analysis?


Áron Bautista Soldevila, Eötvös Loránd University


Power analysis is a statistical method built around power: the probability that an experiment yields a statistically significant result when the effect under study is real (that is, the probability of avoiding a false negative). The power level itself, however, is usually chosen by convention, most often following Cohen’s recommended 80% [1]. How power is chosen can affect both scientific practice and funding, and can promote more rigorous research methods; for example, within-subject designs tend to require fewer participants than between-subject designs to detect an effect of comparable size [2]. In this literature review, I look for a principled basis for choosing power and describe which parameters should be taken into consideration when using the method.

Statistical inference involves four closely related parameters: power, the significance criterion (α), the sample size (N), and the (target) effect size (δ). Knowing any three of them determines the fourth, so there are four types of power analysis [1]. The difficulty in deciding which power to use is that, when these parameters are fixed ahead of time, the effect size is in most cases set arbitrarily. This is a problem for two reasons: the researcher usually does not set out to detect an effect of one exact numeric size, and the effect size is the parameter that most heavily affects power and, through it, the required sample size. Larger effects reach high power at small sample sizes, whereas smaller effects reach comparable power only at large sample sizes [3]. A more data-driven way to choose an effect size could be to draw on meta-analyses of previously reported effect sizes. I conclude that it is difficult to define a uniform way of choosing the effect size ahead of time, so the choice of power level will remain to some degree arbitrary, resting on the scientist's previous experience and expectations about the results. Keeping Cohen’s 80% as a minimum value seems reasonable, but the scientist should weigh the relative cost of a false positive against that of a false negative and set the power accordingly.
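The dependence of sample size on effect size can be illustrated with a minimal sketch. It is not taken from the reviewed sources: it assumes a two-sided comparison of two independent groups, a standardized effect size d, and a normal approximation to the test statistic. The function names n_per_group and achieved_power are hypothetical, chosen only for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for the z-approximation (an assumption;
# an exact calculation would use the noncentral t distribution instead).
_STD_NORMAL = NormalDist()


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate participants needed per group to detect a standardized
    effect size d with the given power in a two-sided, two-group test."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = _STD_NORMAL.inv_cdf(power)          # quantile for target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def achieved_power(d, n, alpha=0.05):
    """Approximate power with n participants per group and effect size d
    (the negligible contribution of the opposite tail is ignored)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(d * sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    # Effect sizes conventionally labeled small, medium, and large.
    for d in (0.2, 0.5, 0.8):
        n = n_per_group(d)
        print(f"d = {d}: n per group = {n}, power = {achieved_power(d, n):.3f}")
```

Under this approximation, 80% power at α = 0.05 requires roughly 393, 63, and 25 participants per group for d = 0.2, 0.5, and 0.8 respectively; an exact t-based calculation gives slightly larger numbers. The spread shows why an arbitrarily chosen target effect size dominates the outcome of the analysis.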


[1] J. Cohen, Statistical Power Analysis for the Behavioral Sciences. L. Erlbaum Associates, 1988.

[2] V. A. Thompson and J. I. D. Campbell, "A power struggle: Between- vs. within-subjects designs in deductive reasoning research," Psychol. Int. J. Psychol. Orient, vol. 47, pp. 277–296, 2004. doi:10.2117/psysoc.2004.277.

[3] G. Cumming, Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. Routledge, 2013.