Effect Sizes

What is an effect size?

Put simply, an effect size measures the magnitude of an effect. For example, if one study group has received a cognitive treatment and the other group has had no treatment, then the effect size would quantify the effectiveness of the cognitive treatment. In other words, the effect size tells us how much more effective the cognitive treatment was compared to no treatment.

Why do we need effect sizes?

Most studies use a measure of statistical significance (i.e. evidence that the observed differences are unlikely to be due to chance) to endorse their findings. However, statistical significance is limited in several ways. Firstly, it does not provide any information about the magnitude of the difference between the two treatments/measures/groups being assessed in the study (i.e. how much more effective the treatment was compared to no treatment, as referenced in the section above). Secondly, with a large enough sample, most studies will produce statistically significant results even when the intervention or treatment has only small effects, and small effects, even if significant, may have little clinical utility. Lastly, significance levels cannot be meaningfully compared across studies, which limits our ability to compare the results of different treatments (for example) across different studies.
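The point that a large sample can make a negligible effect "significant" can be sketched numerically. The snippet below is a minimal illustration, not taken from any of the cited sources: it uses a two-sample z-test (a normal approximation, chosen so the example needs only the standard library) on two groups whose means differ by a trivial amount. The function names and the example numbers (means of 100.5 vs 100.0, shared SD of 10) are invented for illustration.

```python
import math

def z_test_p(mean1, mean2, sd, n):
    """Two-sided p-value from a two-sample z-test.

    Assumes both groups share the same sd and size n; a normal
    approximation is used so the example stays stdlib-only.
    """
    se = sd * math.sqrt(2.0 / n)  # standard error of the mean difference
    z = abs(mean1 - mean2) / se
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def cohens_d(mean1, mean2, sd):
    """Standardised mean difference (both groups share sd here)."""
    return (mean1 - mean2) / sd

# The same tiny effect (d = 0.05) is non-significant with 50 people
# per group, but highly "significant" with 50,000 per group.
d = cohens_d(100.5, 100.0, 10.0)
p_small_n = z_test_p(100.5, 100.0, 10.0, 50)
p_large_n = z_test_p(100.5, 100.0, 10.0, 50_000)

print(round(d, 3))        # 0.05 — negligible either way
print(p_small_n > 0.05)   # True: not significant
print(p_large_n < 0.001)  # True: significant purely through sample size
```

Note that the effect size is identical in both scenarios; only the p-value changes, which is exactly why effect sizes are reported alongside significance tests.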

Common effect sizes

  • Cohen’s d – measures the standardised difference between two group means; is commonly used in meta-analysis.
  • Hedges’ g – similar to Cohen’s d, but preferred when sample sizes are very small (e.g. <20 people).
  • Odds ratio (OR) – reflects the odds of a desired outcome in the intervention group relative to the odds of a similar outcome in the control group.
  • Relative risk (RR) – reflects the probability of an event occurring (e.g. developing a disease) in an exposed group, compared to this same event occurring in a non-exposed group.
  • Pearson’s r – measures the strength and direction of a correlation between two variables.
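The first four measures above can be computed with a few lines of standard-library Python. This is an illustrative sketch, not code from any of the cited sources: the sample data are invented, and the Hedges' g correction factor used here is the common small-sample approximation rather than the exact gamma-function formula.

```python
import math
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s_pooled = math.sqrt(((n1 - 1) * stdev(group1) ** 2 +
                          (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / s_pooled

def hedges_g(group1, group2):
    """Hedges' g: Cohen's d with a small-sample bias correction
    (common approximation to the exact correction factor)."""
    n = len(group1) + len(group2)
    return (1 - 3 / (4 * n - 9)) * cohens_d(group1, group2)

def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a/b = outcome/no-outcome counts in the
    intervention group, c/d = the same counts in the control group."""
    return (a / b) / (c / d)

def relative_risk(a, b, c, d):
    """RR: probability of the event in the exposed group divided by
    the probability in the non-exposed group."""
    return (a / (a + b)) / (c / (c + d))

# Invented example data (scores for a treated and an untreated group).
treated = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
control = [4.2, 4.5, 4.0, 4.8, 4.1, 4.4]
print(round(cohens_d(treated, control), 2))
print(round(hedges_g(treated, control), 2))   # slightly smaller than d

# Invented 2x2 table: 20/100 events when exposed, 10/100 when not.
print(odds_ratio(20, 80, 10, 90))             # 2.25
print(relative_risk(20, 80, 10, 90))          # 2.0
```

Because the correction factor is below 1, Hedges' g is always slightly smaller than Cohen's d, which is why it is preferred for very small samples.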

How do I interpret an effect size?

While there are no set definitions for interpreting effect sizes, some values have been offered cautiously as guidelines or “rules of thumb”. For example, Cohen (1988) proposed that a d of .2 indicates a small effect, a d of .5 a medium effect, and a d of .8 a large effect. However, when interpreting an effect size it is important to refer to prior studies to see where your findings fit into the wider literature, to consider the methodological quality of the study, and to consider the clinical significance of the findings (i.e. has the intervention resulted in a meaningful change in the participants’ lives?).
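Cohen's (1988) benchmarks can be expressed as a small helper for quick labelling. Note the caveats above still apply: the benchmarks are guideline values, and mapping them onto intervals (including the "negligible" label below .2) is a convention of this sketch, not something Cohen prescribed.

```python
def interpret_d(d):
    """Label |d| using Cohen's (1988) rule-of-thumb benchmarks.

    A guideline only: prior literature, study quality, and clinical
    significance matter more than these cut-offs. The interval
    mapping (and the "negligible" label) is this sketch's convention.
    """
    d = abs(d)  # the sign only indicates direction
    if d < 0.2:
        return "negligible"
    elif d < 0.5:
        return "small"
    elif d < 0.8:
        return "medium"
    return "large"

print(interpret_d(0.3))   # small
print(interpret_d(-0.9))  # large (direction ignored)
```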

Helpful websites

  • Statistics How To: a website with simple, plain-language explanations of common statistical procedures. Their Hedges’ g guide is available at http://www.statisticshowto.com/hedges-g/.
  • Research Rundowns: "uncomplicated reviews of educational research methods". Their effect size guide is available at https://researchrundowns.com/quantitative-methods/effect-size/.

References and further reading

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Durlak, J. (2009). How to select, calculate, and interpret effect sizes. Journal of Pediatric Psychology, 34(9), 917–928. https://doi.org/10.1093/jpepsy/jsp004

Effect Size. Retrieved from: https://researchrundowns.com/quantitative-methods/effect-size/

Hedges’ g: Definition, formula (2017, October 31). Retrieved from: http://www.statisticshowto.com/hedges-g/

Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. http://doi.org/10.3389/fpsyg.2013.00863

Sullivan, G. M., & Feinn, R. (2012). Using effect size—or why the P value is not enough. Journal of Graduate Medical Education, 4(3), 279–282. http://doi.org/10.4300/JGME-D-12-00156.1
