The Big Evidence Debate, effect sizes and meta-analysis - Are we just putting lipstick on a pig?

Tuesday 4 June saw a collection of ‘big name’ educational researchers – Nancy Cartwright, Rob Coe, Larry Hedges, Steve Higgins and Dylan Wiliam – coming together with teachers, researchers and knowledge brokers for the ‘big evidence debate’ about the extent to which the metrics from good quality experiments and meta-analyses can really help us improve education in practice and is meta-analysis the best we can do?

Now if you turned up for the ‘big evidence debate’ expecting academic ‘rattles’ and ‘dummies’ to be thrown out of the pram, then you were going to be disappointed, as it became quickly apparent that there was a broad consensus amongst the majority of the presenters, with this consensus – and I’ll come back to a minority view later – being summarised along the lines of:

• Most of the criticisms made of effect sizes by scholars such as Adrian Simpson – a notable absentee from those participating in the debate – have been known about by researchers for the best part of forty years.

• Many of the studies included in meta-analyses are of a low quality and more high quality and well-designed educational experiments are needed, as they form the bedrock of any meta-analysis or meta-meta analysis.

• Just because someone is a competent educational researcher does not make someone competent at undertaking a systematic review and associated meta-analysis.

• It’s incredibly difficult, if not impossible, for even the well-informed ‘lay-person’ to make any kind of judgement about the quality of a meta-analysis.

• There are too many avoidable mistakes being made when undertaking educational meta-analyses – for example - inappropriate comparisons; file-drawer problems; intervention quality; and variation in variability

• However, there are some problems in meta-analysis in education which are unavoidable; aptitude x treatment interactions; sensitivity to instruction; and the selection of studies.

• Nevertheless, regardless of how well they are done, we need to get better at communicating the findings arising from meta-analyses so that they are not subject to over-simplistic interpretations by policymakers, researchers, school leaders and teachers.

Unfortunately, there remains a nagging doubt – if we do all these things – better original research, better meta-analysis and better communication of the outcomes – then all that we may be doing – ‘is putting lipstick on a pig’. In other words, even if make all these changes and improvements in meta-analysis, they in themselves do not tell practitioners much, if anything, about what to do in their own context and setting. Indeed, Nancy Cartwright argued that whilst a randomised controlled trial may tell you something about ‘what worked’ there and a meta-analysis may tell you something about what worked in a number of places, they cannot tell you anything about whether ‘what worked there’ will ‘work here’ . She then goes onto use the image of randomised controlled trials and meta-analysis as being ‘like the small twigs in a bird’s nest. A heap of these twigs will not stay together in the wind. But they can be sculpted together is a tangle of leaves, grass, moss, mud, saliva, and feathers to build a secure nest.’ Cartwright, 2019 p13

As such, randomised controlled trials and meta-analyses should be a small proportion of educational research and should not be over-invested in. Instead, a whole range of activities should be engaged in, for example, case studies, process tracing, ethnographic studies, statistical analysis and the building of models.

Given the above what are the implications for teachers of the ‘big evidence debate’? Well if we synthesise the recommendations for teachers from both Dylan Wiliam and Nancy Cartwright, it’s possible to come up with six questions teachers and school leaders should ask when trying to use educational research to bring about improvement in schools.

1. Does this ‘intervention’ solve a problem we have?

2. Is our setting similar enough in ways that matter to other settings in which the intervention appears to have worked elsewhere?

3. What other information can we find – be it from other fields and disciplines outside of education, your own knowledge of your school and pupils – so you can derive your own causal model and theory of change of how the intervention could work?

4. What needs to be in place for the theory of change to work in our school?

5. How much improvement will we get? What might get in the way of the intervention so that good effects are negligible? Will other things happen to make the intervention redundant?

6. How much will it cost?


Further reading

Nancy Cartwright (2019): What is meant by “rigour” in evidence-based

educational policy and what’s so good about it?, Educational Research and Evaluation, DOI:


Steven Higgins (2018): Improving Learning: Meta-analysis of Intervention Research in Education. Cambridge University Press, Cambridge, UK

Adrian Simpson (2019): Separating arguments from conclusions: the mistaken

role of effect size in educational policy research, Educational Research and Evaluation, DOI:


Dylan Wiliam (2019): Some reflections on the role of evidence in improving

education, Educational Research and Evaluation, DOI: 10.1080/13803611.2019.1617993