The myth of a 0.4 SD effect size and a year's worth of progress

In the last week you will probably have been in a staff development session where the presenter – be it a senior leader, school research lead or consultant – will have made some reference to effect size.  Indeed, there is a very strong chance – particularly if the speaker is an advocate of the work of Professor John Hattie and Visible Learning – that they will make reference to a 0.4 SD effect size as the average expected effect size for one year of progress in school (Hattie, 2015).  In other words, over the course of an academic year you should expect your pupils to make at least 0.4 SD of progress. Unfortunately, although there is some appeal in having a simple numerical measure to represent a year’s worth of progress, it is not quite that simple and is potentially highly misleading.

Wiliam (2016) states that the standardised effect size in an experiment is simply the difference between the mean of the experimental group and the mean of the control group, divided by the standard deviation of the population.  However, as the standard deviation for the achievement of older pupils tends to be greater than for younger pupils, this means that, all other things being equal, you would expect smaller standardised effect sizes for experiments involving older pupils than for experiments with younger pupils.

Wiliam then goes on to cite the work of Bloom, Hill, Black and Lipsey (2008), which looked at the annual progress made by pupils.  Using a number of standardised assessments, Bloom et al looked at the differences in scores achieved by pupils from one year to the next, and then divided this by the pooled standard deviation – which allowed them to calculate the effect size for a year’s worth of teaching. They found that for six-year olds a year’s worth of growth is approximately 1.5 standard deviations, whereas for twelve-year olds a year’s worth of growth was 0.2 standard deviations.  As such, although average growth for school pupils may be approximately 0.4 standard deviations, this average is largely meaningless and has little or no value.
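To make the arithmetic concrete, the Bloom et al approach can be sketched in a few lines of Python. The numbers below are purely illustrative – they are not Bloom et al's data – but they show why the same idea yields very different effect sizes for different age groups:

```python
def annual_growth_effect_size(mean_this_year, mean_last_year,
                              sd_this_year, sd_last_year):
    """A year's growth as an effect size: the difference in mean scores
    between adjacent years, divided by the pooled standard deviation."""
    pooled_sd = ((sd_this_year ** 2 + sd_last_year ** 2) / 2) ** 0.5
    return (mean_this_year - mean_last_year) / pooled_sd

# Illustrative: a raw gain looks larger when the spread of achievement
# (the SD) is small, as it tends to be for younger pupils.
print(annual_growth_effect_size(110, 95, 10, 10))   # 1.5
print(annual_growth_effect_size(203, 200, 15, 15))  # 0.2
```

The point of the sketch is that the denominator, not just the gain, drives the result: older pupils' larger spread of achievement shrinks the effect size even when learning continues.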

Elsewhere, in the Sutton Trust and EEF’s Teaching and Learning Toolkit manual (EEF, 2018), the assumption is made that one year’s worth of progress is equivalent to one standard deviation.  However, the EEF recognise that the notion of one standard deviation representing one year’s progress does not hold for all ages.  For example, data from National Curriculum tests indicates annual progress of about 0.8 of a standard deviation at age 7, falling to 0.7 at age 11 and 0.4 at age 14.

In another study, Luyten, Merrell and Tymms (2017) looked at the impact of schooling on 3500 pupils in Years 1–6 in 20 – predominantly private sector – English primary schools.  They found that the year-to-year gains of schooling declined as pupils got older.  For example, for the youngest pupils the effect size for progress in Reading was 1.15 standard deviations, whereas for the oldest pupils the effect size for year-to-year progress was 0.49 standard deviations. This declining trend in effect size was also seen for measures of Mental Maths and Developed Ability, although General Maths deviates from this pattern, with effect sizes being broadly consistent from year to year.

Discussion and implications

First, it’s important to remember that this analysis has focussed on year-to-year progress made by groups of pupils.  It has not looked at the impact of specific interventions or group effect sizes.  As such, the 0.4 SD effect size for one year’s progress should not be confused or conflated with the 0.4 SD effect size put forward by Professor Hattie as the average effect size for factors influencing education.

Second, if the presenter of your staff development session is not aware of the issues raised by this post, you may want to very professionally point them in the direction of the references listed at the end of this post.  This is not about embarrassing senior colleagues or guests by saying that they are wrong; rather, it’s about saying that the claims they are making are not uncontested.

Third, given how effect sizes vary with both age and the diversity of the population, any attempt by teachers to make judgments about the effectiveness of their teaching by calculating effect sizes may be seriously flawed.  For primary school teachers there is a risk that they will overestimate their effectiveness, whereas for secondary school teachers the opposite is true. Indeed, given some critical issues with effect sizes – see Simpson (2017) and Simpson (2018) – it’s probably wise for individual teachers to steer clear of calculating them.

Finally, this blog raises all sorts of issues about when to trust the experts (Willingham, 2012).  In this blog you have an edublogger challenging the claims made by a world-renowned educational researcher.  It may be that I have misunderstood the claims of Professor Hattie.  It may be that I have misunderstood the arguments of Dylan Wiliam, Hans Luyten, Steve Higgins, the EEF and others.  However, what it does suggest is that it is maybe unwise to rely upon a single expert; instead, particularly in education, it’s worth making sure your evidence-informed practice is influenced by a range of experts.


EEF (2018) Sutton Trust - EEF Teaching and Learning Toolkit & EEF Early Years Toolkit - Technical Appendix and Process Manual (working document v.01).

Hattie, J. (2015) What doesn’t work in education: The politics of distraction. Pearson.

Kraft, M. A. (2018) Interpreting effect sizes of education interventions. Brown University Working Paper. Downloaded Tuesday, April 16, 2019, from ….

Luyten, H., Merrell, C. and Tymms, P. (2017) ‘The contribution of schooling to learning gains of pupils in Years 1 to 6’, School effectiveness and school improvement. Taylor & Francis, 28(3), pp. 374–405.

Simpson, A. (2017) ‘The misdirection of public policy: comparing and combining standardised effect sizes’, Journal of Education Policy. Routledge, pp. 1–17. doi: 10.1080/02680939.2017.1280183.

Simpson, A. (2018) ‘Princesses are bigger than elephants: Effect size as a category error in evidence‐based education’, British Educational Research Journal. Wiley Online Library, 44(5), pp. 897–913.

Wiliam, D. (2016) Leadership for teacher learning. West Palm Beach: Learning Sciences International.

Willingham, D. (2012) When can you trust the experts: How to tell good science from bad in education. San Francisco: John Wiley & Sons.

Thousands of days of INSET and a deluge of references to effect sizes

Please note - since this blogpost was published I have come across an updated version of Professor Kraft’s paper which can be found here

I’ll be posting an update to this blogpost in the coming weeks

In England there are approximately 24,000 schools – which means that next week will see thousands of INSET/CPD days taking place.  In all likelihood, in a great number of those sessions, someone leading the session will make some reference to effect sizes to compare different educational interventions.  However, as much as it might be appealing to use effect sizes to compare interventions, this does not mean that just comparing effect sizes tells you anything useful about the importance of the intervention for you and your school. So to help you get a better understanding of how to use effect sizes, I’m going to draw upon the work of Kraft (2018), who has devised a range of questions that will help you interpret the effect size associated with a particular intervention.  But as it is the start of the academic year, it might be useful to first revisit how an effect size is calculated and some existing benchmarks for interpreting effect sizes.

Calculating an effect size

In simple terms, an effect size is a ‘way of quantifying the difference between two groups’ (Coe, 2017, p. 339) and can be calculated using the following formula:

Effect size =  ((Mean of experimental group) – (Mean of control group))/Pooled standard deviation

To illustrate how an effect size is calculated, Coe references the work of Dowson (2002), who attempted to demonstrate time-of-day effects on children’s learning – or, in other words, do children learn better in the morning or afternoon?

  • Thirty-eight children were included in the intervention, with half being randomly allocated to listen to a story and respond to questions at 9.00 am, whereas the remaining 19 students listened to the same story and questions at 3.00 pm.

  • The children’s understanding of the story was assessed by using a test where the number of correct answers was measured out of twenty.  

  • The morning group had an average score of 15.2, whereas the afternoon group had an average score of 17.9, a difference of 2.7.

  • The effect size of the intervention can now be calculated, using a pooled standard deviation of 3.3: (17.9 – 15.2)/3.3 equals approximately 0.8 SD.
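Coe's formula applied to Dowson's numbers can be checked in a couple of lines of Python, using the group means from the bullets above and the pooled SD of 3.3:

```python
def effect_size(mean_experimental, mean_control, pooled_sd):
    """Standardised mean difference: the effect size formula above."""
    return (mean_experimental - mean_control) / pooled_sd

# Afternoon group (17.9) versus morning group (15.2) from the Dowson example
d = effect_size(17.9, 15.2, 3.3)
print(round(d, 1))  # 0.8
```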

Benchmarks for interpreting effect sizes

A major challenge when interpreting effect sizes is that there is no generally agreed scale. One of the most widely used sets of benchmarks comes from the work of Jacob Cohen (Cohen, 1992), who defines small, medium and large effect sizes as 0.2, 0.5 and 0.8 SD respectively.  However, this scale was derived to identify the sample size required to give yourself a reasonable chance of detecting an effect of that size, if it existed.  As Cohen (1988) notes: ‘The terms “small,” “medium,” and “large” are relative, not only to each other, but to the area of behavioural science, or even more particularly to the specific content and research method being employed in any given investigation’ (p. 25).  As such, Cohen’s benchmarks should not be used to interpret the magnitude of an effect size.

Alternatively, you could use the ‘hinge-point’ of 0.4 SD put forward by John Hattie (Hattie, 2008), who reviewed over 800 meta-analyses and argues that the average effect size of all possible educational influences is 0.4 SD.  Unfortunately, as Kraft (2018) notes, Hattie’s meta-analysis includes studies with small samples, weak research designs and proximal measurements – all of which result in larger effect sizes.  As such, the 0.4 SD hinge point is in all likelihood an overestimate of the average effect size. Indeed, Lipsey et al. (2012) argue that, based on empirical distributions of effect sizes from comparable studies, an effect size of 0.25 SD in education research should be interpreted as large.  Elsewhere, Cheung and Slavin (2015) found that the average effect sizes for interventions ranged from 0.11 to 0.32 SD depending upon sample size and the comparison group.

You could also look at the work of Higgins et al. (2013) and the Education Endowment Foundation’s Teaching and Learning Toolkit, which suggests that low, moderate, high and very high effect sizes are −0.01 to 0.18, 0.19 to 0.44, 0.45 to 0.69, and 0.7 SD and above respectively.  However, it’s important to note that this is based on the assumption that the effect size of a year’s worth of learning at elementary school is 1 SD – yet Bloom et al. (2008) found that for six-year olds a year’s worth of growth is approximately 1.5 standard deviations, whereas for twelve-year olds a year’s worth of growth was 0.2 standard deviations.

Finally, you could refer to the work of Kraft (2018), who undertook an analysis of 481 effect sizes from 242 RCTs of education interventions with achievement outcomes and came up with the following effect size benchmarks for school pupils: less than 0.05 SD is small; 0.05 to less than 0.20 SD is medium; and 0.20 SD or greater is large. Nevertheless, as Kraft himself notes, ‘these are subjective but not arbitrary benchmarks.  They are easy heuristics to remember that reflect the findings of recent meta-analyses’ (p. 18).
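Kraft's benchmarks, as stated above, reduce to a simple threshold check – a minimal sketch:

```python
def kraft_benchmark(effect_size_sd):
    """Classify an effect size (in SD units) using Kraft's (2018) benchmarks:
    under 0.05 is small, 0.05 to under 0.20 is medium, 0.20+ is large."""
    if effect_size_sd < 0.05:
        return "small"
    if effect_size_sd < 0.20:
        return "medium"
    return "large"

print(kraft_benchmark(0.8))   # large
print(kraft_benchmark(0.15))  # medium
```

On this scale Dowson's 0.8 SD counts as large – which is exactly why Kraft pairs the benchmarks with a set of interpretive questions, discussed in the next section.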

Kraft’s Guidelines for interpreting effect sizes

The above would suggest that attempting to interpret effect sizes by the use of standardised benchmarks is not an easy task – different scales suggest that large effect sizes range from 0.2 to 0.8 SD.  As such, if we go back to the 0.8 SD effect size Dowson found when looking at time-of-day effects on pupils, does this mean we have found an intervention with a large effect size, which you and your school should look to implement? Unfortunately, as much as you might like to think so, it’s not that straightforward. Effect sizes are determined not just by the effectiveness of the intervention but by a range of other factors – see Simpson (2017) for a detailed discussion.  Fortunately, Kraft (2018) has identified a number of questions that you can ask to help you interpret effect sizes, and in Table 1 we apply these questions to Dowson’s findings.

[Table 1: Kraft’s questions for interpreting effect sizes, applied to Dowson’s findings]

As such, given the nature of the intervention – in particular, both the relatively short period of time between the intervention and the measurement of the outcomes, and the outcomes being closely aligned to the intervention – we should not be overly surprised to have an effect size which might ‘at first blush’ be interpreted as large.

Implications for teachers, school research leads and school leaders

One, it is necessary to be extremely careful to avoid simplistic interpretations of effect sizes. In particular, where you see Cohen’s benchmarks being used, this should set off alarm bells about the quality of the work you are reading.

Two, when interpreting the effect size of an intervention – particularly in single studies where the effect size is greater than 0.20 SD – it’s worth spending a little time applying Kraft’s set of questions, to see if there are any factors which are exerting upward pressure on the resulting effect size.

Three, when making judgments about an intervention – and whether it should be introduced into your school – the effect size is only one piece of the jigsaw.  Even if an intervention has a relatively small effect size, it may still be worth implementing if the costs are relatively small, the benefits are quickly realised, and it does not require a substantial change in teachers’ behaviour.

Last but not least, no matter how large the effect size of an intervention, what matters are the problems that you face in your classroom, department or school.  Large effect sizes for interventions that will not solve a problem you are faced with are, for you, largely irrelevant.


Bloom, H. S. et al. (2008) ‘Performance trajectories and performance gaps as achievement effect-size benchmarks for educational interventions’, Journal of Research on Educational Effectiveness, 1(4), pp. 289–328.

Cheung, A. and Slavin, R. E. (2015) ‘How methodological features affect effect sizes in education’, Best Evidence Encyclopedia, Johns Hopkins University, Baltimore, MD. Available online at: http://www.bestevidence.org/word/methodological_Sept_21_2015.pdf (accessed 18 February 2016).

Coe, R. (2017) ‘Effect size’, in Coe, R. et al. (eds) Research Methods and Methodologies in Education (2nd edition). London: SAGE.

Cohen, J. (1988) Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.

Cohen, J. (1992) ‘A power primer’, Psychological bulletin, 112(1), p. 155.

Dowson, V. (2002) Time of day effects in schoolchildren’s immediate and delayed recall of meaningful material. TERSE Report. CEM, University of Durham. Available at:

Hattie, J. (2008) Visible learning: A synthesis of over 800 meta-analyses relating to achievement. London: Routledge.

Higgins, S. et al. (2013) The Sutton Trust-Education Endowment Foundation Teaching and Learning Toolkit Manual. London: Education Endowment Foundation.

Kraft, M. A. (2018) Interpreting effect sizes of education interventions. Brown University Working Paper. Downloaded Tuesday, April 16, 2019, from ….

Lipsey, M. W. et al. (2012) ‘Translating the Statistical Representation of the Effects of Education Interventions into More Readily Interpretable Forms’, National Center for Special Education Research. ERIC.

Simpson, A. (2017) ‘The misdirection of public policy: comparing and combining standardised effect sizes’, Journal of Education Policy. Routledge, pp. 1–17. doi: 10.1080/02680939.2017.1280183.

Useful sources of information for teachers, school research leads and senior leaders

No doubt as the summer holidays draw to a close and the new term approaches, there will be teachers, school research leads and senior leaders who will be preparing to deliver a start-of-term INSET/CPD session, which might have as a focus evidence-informed practice. So to help those colleagues with their preparation for such a session, I thought it might be useful to share a range of resources – books, blogs, resources available online and institutional websites – which colleagues might find useful in their preparations. It’s not an exhaustive list of resources; on the other hand, it might point you in the direction of something which helps you deliver a session which has real value for colleagues. So here goes:


Ashman, G. (2018) The Truth about Teaching: An evidence-informed guide for new teachers. London: SAGE.

Barends, E. and Rousseau, D. M. (2018) Evidence-based management: How to use evidence to make better organizational decisions. London: Kogan-Page.

Brown, C. (2015) ‘Leading the use of research & evidence in schools’. London: IOE Press.

Cain, T. (2019) Becoming a Research-Informed School: Why? What? How? London: Routledge.

Didau, D. (2015) What if everything you knew about education was wrong? Crown House Publishing.

Hattie, J. and Zierer, K. (2019) Visible Learning Insights. London: Routledge.

Higgins, S. (2018) Improving Learning: Meta-analysis of Intervention Research in Education. Cambridge: Cambridge University Press.

Kvernbekk, T. (2015) Evidence-Based Practice in Education: Functions of Evidence and Causal Presuppositions. London: Routledge.

Netolicky, D. (2019) Transformational Professional Learning: Making a Difference in Schools. London: Routledge.

Petty, G. (2009) Evidence-based teaching: A practical approach. Nelson Thornes.

Weston, D. and Clay, B. (2018) Unleashing Great Teaching: The Secrets to the Most Effective Teacher Development. Routledge.

Wiliam, D. (2016) Leadership for teacher learning. West Palm Beach: Learning Sciences International.

Willingham, D. (2012) When can you trust the experts: How to tell good science from bad in education. San Francisco: John Wiley & Sons.


Rebecca Allen

Christian Bokhove

Larry Cuban

Centre for Evaluation and Monitoring

Harry Fletcher-Wood

Blake Harvard

Ollie Lovell

Alex Quigley

Tom Sherrington

Robert Slavin

Other resources

Barwick, M. (2018) The Implementation Game Worksheet. Toronto, ON: The Hospital for Sick Children.

CEBE (2017) ‘Leading Research Engagement in Education : Guidance for organisational change’. Coalition for Evidence-Based Education.

CESE (2014) ‘What Works Best: Evidence-based practice to help NSW student performance’. Sydney, NSW: Centre for Education Statistics and Evaluation.

CESE (2017) ‘Cognitive Load Theory: Research that teachers need to understand’. Sydney, NSW: Centre for Education Statistics and Evaluation

Coe, R. and Kime, S. (2019) ‘A (New) Manifesto for Evidence-Based Education: Twenty Years On’. Sunderland, U.K.: Evidence-Based Education.

Coe, R. et al. (2014) What makes great teaching? Review of the underpinning research. London: Sutton Trust.

Deans for Impact (2015) ‘The Science of Learning’. Austin, TX: Deans for Impact.

Dunlosky, J. (2013) ‘Strengthening the student toolbox: Study strategies to boost Learning.’, American Educator. ERIC, 37(3), pp. 12–21.

IfEE (2019) ‘Engaging with Evidence’. York: Institute for Effective Education. 

Metz, A. & Louison, L. (2019) The Hexagon Tool: Exploring Context. Chapel Hill, NC: National Implementation Research Network, Frank Porter Graham Child Development Institute, University of North Carolina at Chapel Hill. Based on Kiser, Zabel, Zachik, & Smith (2007) and Blase, Kiser & Van Dyke (2013).

Nelson, J. and Campbell, C. (2017) ‘Evidence-informed practice in education: meanings and applications’, Educational researcher, 59(2), pp. 127–135. 

Rosenshine, B. (2012) ‘Principles of Instruction: Research based principles that all teachers should know’, American Educator, Spring 2012.

Stoll, L. et al. (2018) ‘Evidence-Informed Teaching: Self-assessment tool for teachers’. London, U.K.: Chartered College of Teaching.

Useful websites

Best Evidence in Brief Fortnightly newsletter which summarises some of the most recent educational research 

Best Evidence Encyclopaedia The Best Evidence Encyclopaedia is a web site created by the Johns Hopkins University School of Education and provides summaries of scientific reviews and is designed to give educators and researchers fair and useful information about the evidence supporting a variety of teaching approaches for school students

Campbell Collaboration The Campbell Collaboration promotes positive social and economic change through the production and use of systematic reviews and other evidence synthesis for evidence-based policy and practice – 38 of which have been produced for education

Chartered College of Teaching The professional association for teachers in England – provides a range of resources for teachers interested in research-use 

Deans for Impact A group of senior US teacher educators who are committed to the use of research in teacher preparation and training

Education Endowment Foundation Guidance Reports Provides a range of evidence-based recommendations for how teachers can address a number of high priority issues

Education Endowment Foundation Teaching and Learning Toolkit A summary of the international evidence on teaching and learning for 5–16 year olds

 EPPI-Centre Based at the Institute of Education, University College London – the EPPI Centre is a specialist centre for the development and conduct of systematic reviews in social science

Evidence for Impact Provides teachers and school leaders with accessible information on which educational interventions have been shown to be effective

Institute for Education Sciences  The Institute of Education Sciences (IES) is the statistics, research, and evaluation arm of the U.S. Department of Education, whose role is to provide scientific evidence on which to ground education practice and policy and to share this information in formats that are useful and accessible

 Research Schools Network A group of 32 schools in England – supported by the Education Endowment Foundation and the Institute of Effective Education – who support the use of evidence to improve teaching practice. 

Teacher Development Trust   Provides access to resources for teachers interested in research use and continuous professional development 

The Learning Scientists A US-based group of cognitive psychological scientists who are interested in the science of learning and who want to make scientific research on learning more accessible to students, teachers, and other educators

What Works Clearinghouse Part of the IES, the What Works Clearinghouse reviews educational research to determine which studies meet rigorous standards, and summarises the findings, so as to answer the question ‘what works in education’

And remember

Just because a writer, text or organisation appears on the above lists, you still need to critically engage with what is said/written. You still need to ask: What is it? Where did I find it? Who has written/said this? When was this written/said? Why has this been written and/or said? How do I know if it is of good quality? (Aveyard, Sharp and Woolliams, 2011)


Aveyard, H., Sharp, P. and Woolliams, M. (2011) A beginner’s guide to critical thinking and writing in health and social care. Maidenhead, Berkshire: McGraw-Hill Education (UK).

The school research lead, The Implementation Game and increasing your chance of successfully implementing an intervention

In last week’s blog we looked at how school leaders could use the Hexagon Tool to help them make better decisions as to whether a particular intervention is right for their school and setting.  In this week’s blog I’m going to look at what comes next – the implementation of the intervention – and how the work of Melanie Barwick and The Implementation Game (TIG) can increase your chance of actually bringing about improvements for your pupils and staff.

Put simply, The Implementation Game is a resource that helps you develop an implementation plan for whatever intervention you are looking to introduce.  Based on the research evidence from the field of implementation science, TIG is ‘played’ by the group of people who will be helping you develop the implementation of the intervention. In particular, it gets the implementation team to think about five different stages of implementation.

·     Preparing for practice change – choosing an innovation – for example, questions around your needs, desired outcomes, and potential evidence-based practices which could achieve those outcomes

·     Preparing for practice change – readiness – whether the proposed innovation meets your needs, is it a good fit, what changes will need to be made, what resources are available, what capacity is available to sustain the innovation, how will you obtain and maintain buy-in, and how will you communicate the goal of the innovation

·     Implementation structure and organisation – what partnerships will be required, what training will be required, what physical space will be needed, how will you maintain fidelity to both the implementation process and the innovation, and what technology will be needed

·     Ongoing implementation support – what staff training will be provided, what technical assistance and coaching will be made available, what data will you collect to evaluate process and outcomes, and how will you go about learning how to improve your processes

·     Maintaining fidelity and sustaining – how will you maintain fidelity and quality over time  

In addition, TIG provides a range of other resources which help you think through:

·     The different factors that might be relevant for your intervention – for example, the characteristics of the intervention, the outer setting and external factors, the inner setting and internal factors, characteristics of individuals involved and the process of engaging with them.

·     Implementation strategies – gathering information, building buy-in, developing relationships, developing training materials, financial strategies and incentives, and quality management

·     Implementation outcomes – for example, acceptability, adoption, appropriateness, costs, feasibility and fidelity

A few observations

It seems to me that TIG is a useful tool that helps you engage in a rigorous process of planning the implementation of an intervention.  However, using the tool does not guarantee success – that will depend upon many factors, including your skills in both using TIG and subsequently implementing the identified actions.  Indeed, one thing that I really like about the tool is that right from the beginning it gets you to think about the sustainability of the intervention – it’s not just about how we can implement an innovation, then tick a box and say ‘job done’.

And finally 

This will be my last blog of the academic year – and I intend to return with new resources and material at the end of August.


Barwick, M. (2018) The Implementation Game Worksheet. Toronto, ON: The Hospital for Sick Children.

The school research lead, the hexagon tool and making good decisions about implementing interventions

As we approach the end of the academic year, you will no doubt be giving some thought to what new practices or interventions you wish to adopt this coming September. Unfortunately, we know that once implemented many of these interventions will not live up to their initial promise – maybe the evidence supporting the intervention was not that robust and the intervention’s benefits were overstated; maybe there isn’t the external or internal expertise available to support the implementation of the intervention; maybe the intervention doesn’t fit with other processes and practices within the setting; maybe the intervention runs counter to the existing school culture and is met with resistance from some of the people who need to implement it.

However, it might be possible to increase your chances of choosing to implement an intervention that not only appears to work in other settings but has a good chance of working in yours. One way of increasing your chances of successfully implementing an intervention is to make sure that, before the intervention is implemented, you undertake some form of structured evaluation of both the intervention and your setting. To help you do this, I’m going to suggest that you have a look at something known as the Hexagon Tool (Metz and Louison, 2019), which will help you undertake a structured appraisal of: the research evidence to back claims for the intervention’s effectiveness; whether there is a clear and usable intervention which can be adapted to the local context; the support available to help implement the intervention; whether the intervention meets the needs of your school/setting; whether the intervention is a good fit with other processes and practices within your school/setting; and whether your school/setting has the capacity to implement the intervention.

Figure 1 The Hexagon Tool


Metz and Louison go on to provide guidance on when to use the tool – ideally at the early stages of the decision-making process of whether to adopt the intervention. They also provide guidance as to how to use the tool – the tasks which need to be completed before the actual use of the tool, and what needs to be done as the tool is being used.

Of particular use is that they provide both a set of questions and an associated rating scale to help you make judgements about each of the six elements. For example, for the ‘evidence’ component they pose the following questions.

1. Are there research data available to demonstrate the effectiveness (e.g. randomized trials, quasi-experimental designs) of the program or practice? If yes, provide citations or links to reports or publications.

2. What is the strength of the evidence? Under what conditions was the evidence developed?

3. What outcomes are expected when the program or practice is implemented as intended? How much of a change can be expected?

4. If research data are not available, are there evaluation data to indicate effectiveness (e.g. pre/post data, testing results, action research)? If yes, provide citations or links to evaluation reports.

5. Is there practice-based evidence or community-defined evidence to indicate effectiveness? If yes, provide citations or links.

6. Is there a well-developed theory of change or logic model that demonstrates how the program or practice is expected to contribute to short term and long term outcomes?

7. Do the studies (research and/or evaluation) provide data specific to the setting in which it will be implemented (e.g., has the program or practice been researched or evaluated in a similar context?)?

If yes, provide citations or links to evaluation reports.

8. Do the studies (research and/or evaluation) provide data specific to effectiveness for culturally and linguistically specific populations? If yes, provide citations or links specific to effectiveness for families or communities from diverse cultural groups.

They suggest you then use these to make a rating judgement, based on the following 5-point scale.

5 High Evidence

The program or practice has documented evidence of effectiveness based on at least two rigorous, external research studies with control groups, and has demonstrated sustained effects at least one year post treatment

4 Evidence

The program or practice has demonstrated effectiveness with one rigorous research study with a control group

3 Some Evidence

The program or practice shows some evidence of effectiveness through less rigorous research studies that include comparison groups

2 Minimal Evidence

The program or practice is guided by a well-developed theory of change or logic model, including clear inclusion and exclusion criteria for the target population, but has not demonstrated effectiveness through a research study

1 No Evidence

The program or practice does not have a well-developed logic model or theory of change and has not demonstrated effectiveness through a research study
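If it helps to keep track of your appraisal, the five-point scale lends itself to a simple lookup table. A minimal sketch in Python – the element names and scores below are hypothetical, for illustration only:

```python
# The five-point evidence scale described above, as a lookup table
EVIDENCE_SCALE = {
    5: "High Evidence",
    4: "Evidence",
    3: "Some Evidence",
    2: "Minimal Evidence",
    1: "No Evidence",
}

# Hypothetical ratings across the six Hexagon Tool elements
ratings = {"evidence": 4, "usability": 3, "supports": 2,
           "need": 5, "fit": 3, "capacity": 2}

# Print a one-line summary per element, ready to share with colleagues
for element, score in ratings.items():
    print(f"{element}: {score} ({EVIDENCE_SCALE[score]})")
```

A summary like this is easy to turn into the ‘spider diagram’ mentioned below, with one axis per element.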

A few observations

A framework such as the Hexagon Tool is extremely helpful in getting you to think about the different aspects of implementing an intervention. Not only that, it does so in a way which should allow you to summarise your evaluation in a form which is easily communicable to others, through the use of the rating scale and maybe a ‘spider diagram’. However, before you can make good use of the tool, you are probably going to have to make a few adjustments to some of the detailed descriptions of each of the elements and the associated questions, so that they reflect your context and system rather than the US system in which the tool was devised. In addition, it’s important to remember that the Hexagon Tool does not provide a substitute for your professional judgment: you will still need to make a decision as to whether or not to proceed with the intervention.

And finally

Tools like the Hexagon Tool are extremely useful in helping you organise your thinking but they are not a substitute for thinking about the intervention and whether ‘what worked there’ might in the right circumstances ‘work here.’


Metz, A. & Louison, L. (2019) The Hexagon Tool: Exploring Context. Chapel Hill, NC: National Implementation Research Network, Frank Porter Graham Child Development Institute, University of North Carolina at Chapel Hill. Based on Kiser, Zabel, Zachik, & Smith (2007) and Blase, Kiser & Van Dyke (2013).