The school research lead and RCTs - what can a systematic review tell us?

As a school research lead, one of the things that you will have to get to grips with is the debate over whether randomised controlled trials (RCTs) can make any meaningful contribution to understanding ‘what works’ in educational settings. Helpfully, Connolly, Keenan, et al. (2018) have recently published a systematic review on the use of RCTs in education, which seeks to address four key criticisms of RCTs: that it is not possible to undertake RCTs in education; that RCTs are blunt research designs that ignore context and experience; that RCTs tend to generate simplistic universal laws of ‘cause and effect’; and that they are inherently descriptive and contribute little to theory. In the rest of this post I will provide extracts from the systematic review, examine the review’s answer to the questions posed, identify some missed opportunities, and finally make some comments about RCTs and the work of school research leads.

Connolly, et al. (2018) systematic review - extracts

The systematic review found a total of 1017 unique RCTs that have been completed and reported between 1980 and 2016, with three quarters of these being produced over the last 10 years.

Over half of all RCTs identified were conducted in North America and a little under a third in Europe.

The RCTs cover a wide range of educational settings and focus on an equally wide range of educational interventions and outcomes.

Connolly et al. go on to argue that the review: provides clear evidence to counter the claim that it is just not possible to do RCTs in education. As has been demonstrated, there now exist over 1000 RCTs that have been successfully completed and reported across a wide range of educational settings and focusing on an equally wide range of interventions and outcomes. Whilst there is a clear dominance of RCTs from the United States and Canada, there are significant numbers conducted across Europe and many other parts of the world. Many of these have been relatively large-scale trials, with nearly a quarter (248 RCTs in total) involving over one thousand participants. Moreover, a significant majority of the RCTs identified (80.8%) were able to generate evidence of the effects of the educational interventions under investigation.

As noted earlier, these figures are likely to be under-estimates given the limitation of the present systematic review, with its restricted focus on articles and reports published in English. Nevertheless, the evidence is compelling that it is quite possible to undertake RCTs in educational settings. Indeed, across the 1017 RCTs identified through this systematic review, there are almost 1.3 million people that have participated in an RCT within an education setting between 1980 and 2016 ……

Secondly, there is some evidence to counter the criticism that RCTs ignore context and experience. Whilst they only constitute a minority of the trials identified (37.7%), there were 381 RCTs found that included a process evaluation component…..

Thirdly, there is more evidence to suggest that the RCTs produced within the time period have attempted to avoid the generation of universal laws of ‘cause and effect’. Certainly, those RCTs identified that have included at least some subgroup analyses would suggest a more nuanced approach amongst those conducting RCTs, that acknowledges that educational interventions are not likely to have the same effect across all contexts and all groups of students. Moreover, this is clearly evident amongst the majority of RCTs reported (77.9%) that included at least some discussion of and reflections on the limitations of the findings in terms of their generalizability….

…. (in) relation to the fourth criticism regarding the atheoretical nature of RCTs, this is also challenged to some extent by the findings presented above. A clear majority of RCTs that were reported included some discussion of the theory underpinning the interventions under investigation (77.3%). Moreover, a majority of RCTs (60.5%) also provided some reflections on the implications of their findings for theory….

Overall, the findings from this systematic review of RCTs undertaken in education 1980 – 2016 are mixed. On the one hand, there is clear evidence that it is possible to conduct RCTs in education, regardless of the nature of the education setting or of the particular type and focus of the intervention under consideration….

On the other hand, it is perhaps not surprising that criticisms of RCTs continue when nearly two thirds of RCTs in this period of time have not included a process evaluation component and where nearly half of them have not looked beyond the overall effects of the intervention in question for the sample as a whole. Similarly, it is difficult to challenge the view that RCTs promote a simplistic and atheoretical approach to educational research when nearly 40% of trials in this analysis have failed to reflect upon the implications of their findings for theory.

Have the criticisms of RCTs been answered?

The main assumption of the systematic review is that it is possible to confirm whether RCTs can be carried out in education merely by counting the number of RCTs, subject to certain criteria, that have been carried out. However, this does not answer the question of whether it is possible to carry out RCTs in an educational context. It merely tells you that a number of RCTs have been carried out by researchers and commissioners of research who believe it is possible to carry out RCTs within education.

Missed opportunities

One of the first things that struck me about the systematic review was that it appeared to give very little attention, if any, to whether any of the RCTs had systematic flaws in their design. It may well be that over 1,000 RCTs were conducted between 1980 and 2016, but a recent review by Ginsburg and Smith (2016) of 27 RCTs that met the minimum standards of the US-based What Works Clearinghouse found that 26 of them had serious threats to their usefulness. If that ratio held across Connolly’s review, it would suggest that maybe only 40 or so of the RCTs included did not have some kind of serious threat to their trustworthiness. Second, we then need to look at how many of these 40 or so RCTs included a process evaluation, made a contribution to theory and did not seek to overgeneralise. It’s likely to be a very small percentage of the RCTs that have been carried out. In other words, out of 1,000-plus RCTs, how many did not have fundamental flaws in design or some other failing?
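
The ‘40 or so’ figure is a back-of-envelope extrapolation, and it can be reproduced in a few lines. This is only a minimal sketch in Python: it assumes the 1-in-27 ratio reported by Ginsburg and Smith (2016) carries over unchanged to all 1017 RCTs in Connolly’s review, which is a strong and untested assumption, since their sample came from the What Works Clearinghouse rather than from the review itself. The variable names are illustrative only.

```python
# Back-of-envelope extrapolation, NOT a finding of either review.
# Assumption: the ratio from Ginsburg and Smith (2016) applies uniformly
# to every RCT identified by Connolly et al. (2018).
total_rcts = 1017          # RCTs identified by Connolly et al. (2018)
reviewed = 27              # RCTs examined by Ginsburg and Smith (2016)
with_serious_threats = 26  # of those 27, the number with serious threats

# Share of the reviewed RCTs with no serious threat to their usefulness
share_without_threats = (reviewed - with_serious_threats) / reviewed

# Scale that share up to the full set of RCTs in the systematic review
estimated_sound = total_rcts * share_without_threats

print(f"Share without serious threats: {share_without_threats:.1%}")
print(f"Estimated RCTs without serious threats: {estimated_sound:.0f}")
```

Run as written, this gives a share of about 3.7% and an estimate of roughly 38 RCTs, which is where ‘40 or so’ comes from. The result rests on a single trial out of 27, so it should be read as an order-of-magnitude indication rather than a count.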

What does this all mean for you as a school research lead?

For a start, Connolly et al provide an accessible introduction to a discussion of the issues associated with RCTs. As such, it’s worth spending time reading the review.

Second, the review highlights some of the issues associated with hierarchies of evidence. Both systematic reviews and RCTs appear near the top of hierarchies of evidence. However, what matters more is whether the research design is appropriate to the research question at hand (Gorard, See, et al., 2017; Sharples, 2017).

Third, when reading RCTs it is necessary to have some kind of checklist so that you can judge the trustworthiness of what you are reading. In my forthcoming book, Evidence-Based School Leadership and Management: A Practical Guide (Jones, 2018), I try to do this by combining the work of Gorard, et al. (2017) and Ginsburg and Smith (2016). For example, look out for whether the intervention is being evaluated by the researchers who developed the intervention.

Fourth, setting aside the issue of whether it’s possible to conduct an RCT within education, and assuming that it is, there is the issue of how the analysis of the results is carried out, and whether p-values and statistical significance have been misused. This is not the time or place to go into detail about this debate. That said, I would recommend you have a look at Wasserstein and Lazar (2016) on the American Statistical Association’s statement on p-values, and at Gorard, et al. (2017), with more adventurous readers having a look at Greenland, Senn, et al. (2016). If, on the other hand, you fancy something far more accessible, I recommend Richard Selfridge’s new book, Databusting for Schools.

And finally

RCTs within educational research are not going away any time in the near future, and I hope this post has provided a little more clarity.


Connolly, P., Keenan, C. and Urbanska, K. (2018). The Trials of Evidence-Based Practice in Education: A Systematic Review of Randomised Controlled Trials in Education Research 1980–2016. Educational Research.
Ginsburg, A. and Smith, M. (2016). Do Randomized Controlled Trials Meet the “Gold Standard”? American Enterprise Institute.
Gorard, S., See, B. and Siddiqui, N. (2017). The Trials of Evidence-Based Education. London. Routledge.
Greenland, S., Senn, S., Rothman, K., Carlin, J., Poole, C., Goodman, S. and Altman, D. (2016). Statistical Tests, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations. European Journal of Epidemiology. 31. 4. 337-350.
Jones, G. (2018). Evidence-Based School Leadership and Management: A Practical Guide. London. SAGE Publishing.
Selfridge, R. (2018). Databusting for Schools. London. SAGE Publishing.
Sharples, J. (2017). A Vision for an Evidence-Based Education System - or Some Things I'd Like to See. London. Education Endowment Foundation.
Wasserstein, R. and Lazar, N. (2016). The ASA's Statement on P-Values: Context, Process, and Purpose. The American Statistician. 70. 2. 129-133.