Disciplined Inquiry, performance review and asking well-structured and formulated questions

Recently, I wrote about how disciplined inquiry was being used by some schools as a central part of their teacher performance review scheme. Now models of disciplined inquiry will often be based around some form of structured inquiry question, such as the one put forward by the Institute for Effective Education:

What impact does (what practice?) delivered (over how long?) have on (what outcome?) for (whom?)?

Two examples of this type of inquiry question have been very helpfully provided by Shaun Allison and Durrington High School:

• What impact does increasing the frequency of modelling writing, followed by structured metacognitive reflection in lessons delivered over a year have on the quality of creative writing for my two Y10 classes?

• What impact does explicitly teaching Tier 2 and 3 geographical vocabulary using knowledge organisers delivered over a year have on the appropriate use of tier 2/3 vocabulary in written responses for the disadvantaged students in my Y8 class?

However, given the diversity of teaching staff, it is unlikely that a single question structure is going to meet every teacher’s needs, interests or requirements. Furthermore, a single question structure is unlikely to be sustainable over a number of years – with teachers losing enthusiasm for ‘disciplined inquiry’ when asked to do more of the same.

With this in mind, it’s probably worth examining a number of other formats for developing structured questions. One way of doing this is to use a conceptual tool known as PICO, which is explained below:

• Pupil or Problem - Who? - How would you describe the group of pupils or problem?

• Intervention - What or how? What are you planning to do with your pupils?

• Comparison - Compared to what? What is the alternative to the intervention – what else could you do?

• Outcome - Aim or objective(s)- What are you trying to achieve?

Sometimes additional elements are added to PICO, including C for context and the type of school, class or setting – or T for time – which relates to the time period it takes for the intervention to achieve the outcomes you are trying to achieve.
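If you keep your inquiry questions in a spreadsheet or a short script, the PICO elements above can be sketched as a small data structure. What follows is only an illustrative sketch: the class name, the field names, the sentence template and the example values (including the one-year time period) are my own assumptions, not part of PICO itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PicoQuestion:
    """Hypothetical container for the PICO elements, plus optional C and T."""
    pupils: str                        # P: who, or what problem?
    intervention: str                  # I: what are you planning to do?
    comparison: str                    # C: compared to what?
    outcome: str                       # O: what are you trying to achieve?
    context: Optional[str] = None      # optional C: school, class or setting
    time_period: Optional[str] = None  # optional T: how long?

    def as_question(self) -> str:
        """Assemble the elements into one sentence (template is an assumption)."""
        where = f" in {self.context}" if self.context else ""
        when = f" over {self.time_period}" if self.time_period else ""
        return (
            f"For {self.pupils}{where}, what impact does {self.intervention}, "
            f"compared with {self.comparison},{when} have on {self.outcome}?"
        )


# Illustrative values only.
example = PicoQuestion(
    pupils="Y8 pupils who have made insufficient progress",
    intervention="remaining in Y8 with additional support",
    comparison="progression to Y9",
    outcome="their learning outcomes",
    context="an inner-city secondary school",
    time_period="one year",
)
print(example.as_question())
```

Writing the elements down separately like this makes it easier to spot when one of them, most often the comparison, is still missing.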

Although PICO is probably the most widely used structure for formulating questions, there are a number of variations which could be used. These alternatives are especially useful when your focus is not just on the outcomes for pupils but also on other issues, such as: who are the stakeholders in the situation; from whose perspective are you looking at it; and how do pupils experience the intervention. Examples of these alternative frameworks include:

  • PESCICO Pupils, Environment, Stakeholders, Intervention, Comparison, Outcome

  • EPICOT Evidence, Pupils, Intervention, Comparison, Outcome, Time-period

  • PIE Pupils, Intervention, Experience/Effect

  • SPICE Setting, Perspective, Intervention, Comparator, Evaluation

  • PISCO Pupils, Intervention, Setting, Comparison, Outcome

  • CIMO Context, Intervention, Mechanism, Outcome

  • CIAO Context, Intervention, Alternative Intervention, Outcome

Let’s now look at worked examples for two frameworks: PISCO and SPIDER.


In this example, we are interested in whether ‘holding back’ Y8 pupils will have a beneficial impact on their learning outcomes.

  • Pupil or Problem - Who? Y8 pupils who have made insufficient progress

  • Intervention - What or how? Pupils will not progress to Y9 and will remain in Y8 and be provided with additional support

  • Setting - Where? A secondary school in an inner city

  • Comparison - Compared to what? Progression to Y9

  • Outcome - Aim or objective(s)? For pupils to have caught up with pupils who progressed to Y9


For this example, we are interested in the following question: ‘What are Y7 pupils’ experiences of transition from primary school to secondary school?’

  • Sample - Who? Y7 pupils

  • Phenomenon of Interest - What’s taking place or happening? Pupils’ transition from Y6 to Y7

  • Design - Study design: interviews, focus groups and surveys

  • Evaluation - Outcome measures: perceptions of support, expectations and attitudes towards school

  • Research type - Qualitative

What are the benefits of developing structured questions?

As a busy teacher, you may ask yourself whether it’s worth taking the time and effort to develop structured and well-formulated questions. Unfortunately, there is little or no research which supports the claim that such an approach benefits teachers – nor, for that matter, that disciplined inquiry is an effective component of performance management. However, within the context of medicine and health-care, seven potential benefits from the question formulation process have been identified (Straus et al. 2011), and these are likely to transfer to the setting of a school. These benefits include:

• Focusing your scarce professional learning time on evidence that is directly relevant to the needs of your pupils

• Concentrating professional learning time on searching for evidence that directly addresses your own requirements for enhanced professional knowledge.

• Developing time-effective search strategies to help you access multiple sources of relevant and useful evidence.

• Suggesting the forms that useful answers might look like.

• Helping you communicate more clearly when requesting support and guidance from colleagues

• Supporting your colleagues in their own professional learning, by helping them ask better questions

• Increasing your level of job satisfaction by asking well-formulated questions which are then answered.

Tips for developing your question

There is no one preferred way of developing a question which forms the basis of your disciplined inquiry. However, there are a number of actions you can take which will increase the likelihood of developing a question that may lead to improvement in both your teaching and outcomes for pupils.

• Seek help from colleagues. If your school has a school research lead, get their advice: they may help you refine your question or point you in the direction of colleagues who have looked into the same or a similar question.

• Developing your question is an iterative process and your question will change as you discuss issues with colleagues, begin to explore the literature and your own thinking changes.

• Don’t be afraid to write down your question, even if, in your mind, it is incomplete or not yet fully formed. Keep a written record of your thinking as it develops.

• When developing a PICO or similar type of question, particularly if you are a new teacher, you may find it difficult to identify both the intervention and the comparator. At this stage you may want to focus on both the problem being encountered and the outcomes which you wish to bring about.

• When thinking about the comparator you might want to spend some time working on how you would describe ‘business as usual’ – as this is likely to be the comparator to whatever intervention is being considered.

• In all likelihood, for any problem you are trying to address, there will be more than one question you could ask. It will be useful to focus on a single question, when considering how to access different sources of evidence.

• Before committing any time and effort to trying to answer your well-formulated question, think long and hard about whether the benefits from answering it will outweigh the costs. Is your question feasible, interesting, novel, ethical and relevant (Hulley et al. 2013)?


Hulley, Stephen B et al. 2013. Designing Clinical Research. Philadelphia: Lippincott Williams & Wilkins.

Straus, S E, P Glasziou, S W Richardson, and B Haynes. 2011. Evidence-Based Medicine: How to Practice and Teach It. (Fourth Edition). Edinburgh: Churchill Livingstone: Elsevier.

We need to talk about RISE and evidence-informed school improvement - is there a crisis in the use of evidence in schools?

Recently published research (Wiggins et al. 2019) suggests that an evidence-informed approach to school improvement – the RISE Project – may lead to pupils making small amounts of additional progress in mathematics and English compared to children in comparison schools. However, these differences are both small and not statistically significant, so the true impact of the project may have been zero. Now for critics of the use of research evidence in schools, this may indeed be ‘grist to their mill’, with the argument being put forward that schools should not commit resources to an approach to school improvement which does not bring about improvements in outcomes for children. So where does that leave the proponents of research use in schools? Well, I’d like to make the following observations, though I need to add that these observations are made with the benefit of hindsight and may not have been obvious at the time.

First, the evidence-informed model of school improvement was new – so we shouldn’t be surprised if new approaches don’t always work perfectly first time. That doesn’t mean we should be blasé about the results and try and downplay them just because they don’t fit in with our view about the potential importance of the role of research evidence in bringing about school improvement. More thinking may need to be done to develop both robust theories of change and theories of action, which will increase the probability of success. Indeed, if we can’t develop these robust theories of change/action – then we may need to think again.

Second, the RISE Model is just one model of using evidence to bring about school improvement, with the Research Lead model being highly reliant on individuals within both Huntington School and the intervention schools. Indeed, the model may have been fatally flawed from the outset, as work in other fields, for example (Kislov, Wilson, and Boaden 2017), suggests that it is probably unreasonable to expect any one individual to have all the skills necessary to be a successful school research champion, cope with the different types of knowledge, build connections both within and outside of the school, and at the same time maintain their credibility with diverse audiences. As such, we need to look at different ways of increasing the collective capacity and capability of using research and other evidence in schools – which may have greater potential to bring about school improvement.

Third, the EEF’s school improvement cycle may in itself be flawed and require further revision. As it stands, the EEF school improvement cycle consists of five steps – decide what you want to achieve; identify possible solutions - with a focus on external evidence; give the idea the best chance of success; did it work; securing and spreading change by mobilising knowledge. However, for me, there are two main problems. First, at the beginning of the cycle there is insufficient emphasis on the mobilisation of existing knowledge within the school, with too much emphasis on external research evidence. The work of Dr Vicky Ward is very useful on how to engage in knowledge mobilisation. Second, having identified possible solutions the next step focusses on implementation, whereas there needs to be a step where all sources of evidence – research evidence, practitioner expertise, stakeholder views and school data – are aggregated and a professional judgment is made on how to proceed.

Fourth, some of the problems encountered – for example the high levels of staff turnover, with staff involved in a high-profile national project using it as a springboard for promotion – were pretty predictable and should have been planned for at the start of the project.

Fifth, the project was perhaps over-ambitious in its scale – with over 20 schools actively involved in the intervention – and maybe the project would have benefitted from a small efficacy trial before conducting a randomised controlled trial. Indeed, there may need to be a range of efficacy trials looking at a range of different models for evidence-informed school improvement.

Sixth, we need to talk about headteachers and their role in promoting evidence-informed practice in schools. It’s now pretty clear that headteachers have a critical role in supporting the development of evidence-informed practice (Coldwell et al. 2017) and if they are not ‘on-board’ then Research Leads are not going to have the support necessary for their work to be a success. Indeed, the EEF may need to give some thought not just to how schools are recruited to participate in trials but also to the level of commitment of the headteacher to the trial, with a process being used to gauge headteacher commitment to research use in schools.

And finally

The EEF and the writers of the report should be applauded for the use of the TIDieR framework, which provides a standardised way of reporting on an intervention and is a great example of education learning from other fields and disciplines.


Coldwell, Michael et al. 2017. Evidence-Informed Teaching: An Evaluation of Progress in England. Research Report. London, U.K.: Department for Education.

Kislov, Roman, Paul Wilson, and Ruth Boaden. 2017. “The ‘Dark Side’ of Knowledge Brokering.” Journal of Health Services Research & Policy 22(2): 107–12.

Wiggins, M et al. 2019. The RISE Project: Evidence-Informed School Improvement: Evaluation Report. London.

“It’s time we changed – converting effect sizes to months of learning is seriously flawed”

Anyone with any kind of passing interest in evidence-informed practice in schools will be aware that effect sizes are often used to report on the effects of educational interventions, programmes and policies. These results are then summarised in meta-analyses and meta-meta-analyses and are often then translated into more “understandable” units, such as years or months of learning. Accordingly, John Hattie writes about an effect size of 0.4 SD being equivalent to a year’s worth of learning. Elsewhere, the Education Endowment Foundation in their Teaching and Learning Toolkit have developed a table which converts different effect sizes into months of additional progress being made by pupils. For example, an effect size of 0.44 SD is deemed to be worth an additional five months of learning, and an effect size of 0.96 SD an additional 12 months of learning.

However, this approach of converting effect sizes into periods of time of learning would appear to be seriously flawed. In an article recently published in Educational Researcher, Matthew Baird and John Pane conclude:

Although converting standardized effect sizes in education to years (or months, weeks or days) of learning has a potential advantage of easy interpretations, it comes with many serious limitations that can lead to unreasonable results, misinterpretations or even cherry picking from among implementation variants that can produce substantially inconsistent results. We recommend avoiding this translation in all cases, and that consumers of research results look with scepticism towards research translated into units of time. (Baird and Pane 2019, p. 227)

Instead, Baird and Pane argue that when trying to interpret standardised effect sizes – which by their very nature are measured on an abstract scale – the best way to judge whether a programme/intervention effect is meaningful is to look at what the impact would have been on the median student in the control group, if they had received the treatment/intervention. For example, assume a normal distribution in both the intervention and control groups, and take the median pupil in the control group – let’s say the 13th ranked pupil in a group of 25. If they had received the treatment and the standardised effect size was 0.4 SD, the pupil would now be ranked 9th in the control group.
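The arithmetic behind this example can be sketched in a few lines of Python. This is a rough illustration rather than Baird and Pane’s own procedure: it assumes normally distributed scores, and the function names and the rule for turning a percentile into a rank (rounding up the number of pupils still ahead) are my own assumptions.

```python
import math
from statistics import NormalDist


def control_percentile(effect_size_sd: float) -> float:
    """Percentile of the control distribution reached by the control-group
    median pupil after a shift of `effect_size_sd` standard deviations,
    assuming normally distributed scores."""
    return NormalDist().cdf(effect_size_sd)


def approximate_rank(percentile: float, group_size: int) -> int:
    """Rough rank (1 = highest) within a group of `group_size` pupils.

    Assumption: the fraction (1 - percentile) of the group is ahead of the
    pupil, and that count is rounded up, so the median of 25 is ranked 13th.
    """
    return max(1, math.ceil((1 - percentile) * group_size))


percentile = control_percentile(0.4)
print(f"Percentile after a 0.4 SD effect: {percentile:.1%}")
print("Rank before treatment:", approximate_rank(0.5, 25))
print("Rank after treatment:", approximate_rank(percentile, 25))
```

Under these assumptions an effect of 0.4 SD moves the median pupil to roughly the 66th percentile of the control group, which is the move from 13th to 9th in a group of 25.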

Well, what are the implications of this for anyone who works with and in schools and is interested in evidence-informed school improvement?

• Baird and Pane’s analysis does not mean that the work of Hattie or the Education Endowment Foundation is invalid and no longer helpful. Rather it means we should be extremely careful about any claims about interventions providing benefits in terms of months or years of additional progress.

• There are additional problems with the “converting effect sizes to months of learning” approach. For example, the rate of progress of pupils’ achievement varies throughout school and across subjects (see https://onlinelibrary.wiley.com/doi/full/10.1111/j.1750-8606.2008.00061.x) and the translation doesn’t make sense for non-cognitive measures (eg, of pupils’ well-being or motivation).

• There’s an interesting balancing act to be had. On the one hand, given their knowledge and understanding of research, teachers and school leaders are going to have to rely on trusted sources to help them make the most of research evidence in bringing about school improvement. On the other hand, no matter how ‘big the name’, they may well have got something wrong, so at all times some form of professional scepticism is required.

• Effect sizes and whether they can be reliably converted into some kind of more interpretable metric may be neither here nor there. What matters is whether there is a causal relationship between intervention X and outcome Y and what are the support factors necessary for that causal relationship to work, (Kvernbekk 2015).

• Given the importance that teachers and school leaders give to sources of evidence other than research – say from colleagues and other schools – when making decisions, then we probably need to spend more time helping teachers and school leaders engage in critical yet constructive appraisal of the practical reasoning of colleagues.

• Any of us involved in trying to support the use of evidence in bringing about school improvement may need to be a little more honest with our colleagues. Well, if not a little more honest, maybe we need to show them a little more professional respect. Let’s no longer try and turn the complex process of education into overly simplistic measures of learning just because those same measures are easy to communicate and interpret. Let’s be upfront with colleagues and say – this stuff is not simple, is not easy, and there are no off-the-shelf answers, and when using research it’s going to take extremely hard work to make a real difference to pupils’ learning – and you know what – it’ll probably not be that easy to measure.

And finally

It’s worth remembering no matter what precautions you take when trying to convert an effect size into something more understandable, this does not take away any of the problems associated with effect sizes in themselves. See (Simpson 2018) for an extended discussion of these issues.


Baird, Matthew D, and John F Pane. 2019. “Translating Standardized Effects of Education Programs Into More Interpretable Metrics.” Educational Researcher 48(4): 217–28. https://doi.org/10.3102/0013189X19848729.

Hattie, J. A. 2008. Visible Learning. London: Routledge

Higgins, S., Katsipataki, M., Coleman, R., Henderson, P., Major, L. and Coe, R. (2015). The Sutton Trust-Education Endowment Foundation Teaching and Learning Toolkit. London. Education Endowment Foundation.

Kvernbekk, Tone. 2015. Evidence-Based Practice in Education: Functions of Evidence and Causal Presuppositions. Routledge.

Simpson, Adrian. 2018. “Princesses Are Bigger than Elephants: Effect Size as a Category Error in Evidence‐based Education.” British Educational Research Journal 44(5): 897–913.

Research shows that academic research has a relatively small impact on teachers’ decision-making – well what a surprise that is!

Recent research undertaken by Walker, Nelson, Bradshaw with Brown (2019) has found that academic research has a relatively small impact on teachers’ decision-making, with teachers more likely to draw ideas and support from their own experiences (60 per cent) or the experiences of other teachers/schools (42 per cent). Walker et al go on to note that this finding is consistent with previous research and argue that these findings suggest that those with an interest in supporting research-informed practice in schools should consider working with and through schools, and those that support them, to explore their potential for brokering research knowledge for other schools and teachers.

In many ways we should not be surprised by these findings, as similar findings about research use have emerged from ethnographic research in UK general practice (Gabbay and le May, 2004), which showed that clinicians very rarely accessed research findings and other sources of formal knowledge but instead preferred to rely on ‘mindlines’ – which they defined as ‘collectively reinforced, internalised, tacit guidelines. These were informed by brief reading but mainly by their own and their colleagues’ experience, their interactions with each other and with opinion leaders, patients, and pharmaceutical representatives, and other sources of largely tacit knowledge that built on their early training and their own and their colleagues’ experience.’

Now in this short blog I cannot do full justice to the concept of ‘mindlines’. Nevertheless, if you would like to find out more, I suggest that you have a look at Gabbay and Le May (2011) and Gabbay and le May (2016). That said, for the rest of this blog I’m going to draw on the work of Wieringa and Greenhalgh (2015), who conducted a systematic review on mindlines, and draw out some of mindlines’ key characteristics:

• Mindlines are consistent with the notion that knowledge is not a set of external facts waiting to be ‘translated’ or ‘disseminated’ but instead knowledge is fluid and multi-directional – and constantly being recreated in different settings by different people, and on an on-going basis.

• Mindlines involve a shared reality but not necessarily homogenous reality – made up of multiple individual and temporary realities of clinicians, researchers, guideline makers and patients

• Mindlines incorporate tacit knowledge and knowledge in practice in context

• Mindlines involve the construction of knowledge through social processes – this takes the form of discussions influenced by cultural and historic forces – and are validated through a process of ‘reality’ pushing back in a local context

• Mindlines are consistent with the view that anyone, including patients, is capable of creating valid knowledge and can be an expert in consultations

• Mindlines may not be manageable through direct interventions; however, they may be self-organising as the best solution for a particular problem in a defined situation is sought out

What are the implications of ‘mindlines’ for those interested in brokering research knowledge in schools?

Wieringa and Greenhalgh go on to make a number of observations about the implications for practitioners, academics and policymakers if they embrace the mindlines paradigm, which are equally applicable to schools.

1. We need to examine how to go about integrating various sources of knowledge and whether convincing information leads to improved decision-making

2. In doing so, we need to think more widely about what counts as evidence in school – and how these different types of evidence can best be used by teachers and school leaders in the decision-making process

3. We need to examine how mindlines are created and validated by teachers, school leaders and other school stakeholders and how they subsequently develop over time.

In other words, research which focusses on how researchers can better ‘translate’ or ‘disseminate’ research is unlikely to have much impact on the guidelines teachers and school leaders use to make decisions.

And finally

I’d just like to make a few observations about the report by Walker et al (2019). First, I don’t like reports where reference is made to the ‘majority of respondents’ and no supporting percentage figure is given. Second, the terms climate and culture seem to be used interchangeably although they refer to quite different things. Third, significant differences between groups of teachers are highlighted yet no supporting data is provided, nor any explanation of what is meant in this context by significant.


Gabbay, J. and le May, A. (2004) ‘Evidence based guidelines or collectively constructed “mindlines?” Ethnographic study of knowledge management in primary care’, BMJ. British Medical Journal Publishing Group, 329(7473), p. 1013.

Gabbay, J. and le May, A. (2016) ‘Mindlines: making sense of evidence in practice.’, The British journal of general practice : the journal of the Royal College of General Practitioners. British Journal of General Practice, 66(649), pp. 402–3. doi: 10.3399/bjgp16X686221.

Gabbay, J. and Le May, A. (2011) Organisational innovation in health services: lessons from the NHS Treatment Centres. Policy Press.

Walker, M., Nelson, J., Bradshaw, S. with Brown, C. (2019) Teachers’ engagement with research: what do we know? A research briefing. London: Education Endowment Foundation.

Wieringa, S. and Greenhalgh, T. (2015) ‘10 years of mindlines: a systematic review and commentary’, Implementation Science. BioMed Central, 10(1), p. 45.

The school research lead, data-literacy and competitive storytelling in schools

A major challenge for school leaders and research champions wishing to make the most of research evidence in their school is to make sure not only that they understand the relevant research evidence but that they also understand their school context. In particular, they need to be able to analyse and interpret their own school data. This is particularly important because, when discussing and evaluating data from within your school, you are in all likelihood taking part in some form of competitive storytelling (Barends and Rousseau, 2018). The same data will be used by different individuals to tell different stories about what is happening within the school. Those stories will be used to inform decisions within the school, and if we want to improve decision-making in schools it will help if decision-makers have a sound understanding of the quality of the data on which those stories are based.

To help you get a better understanding of the data-informed stories being told in your school, in this post I’m going to look at some of the fundamental challenges in trying to understand school data and, in particular, some of the inherent problems and limitations of that data. This will involve a discussion of the following: measurement error; the small number problem; confounding; and range restriction. In a future post, I will look at some of the challenges of trying to accurately interpret the data, with special reference to how that data is presented.

The problems and limitations of school data

Measurement error

Measurement error presents a real difficulty when trying to interpret quantitative school data. These errors occur when the response given differs from the real value. These mistakes may be the result of, say: the respondent not understanding what is being asked of them – e.g. an NQT not knowing what’s being measured and how to do it; how and when the data is collected – say at 5pm on a Friday or the last day of term; or how missing data is treated – somehow it gets filled in. These errors may be random, although they can lead to systematic bias if they are not random.

The small number problem

When school data are based on a small number of observations, any statistic which is calculated from them will contain random error. In schools, small departments are more likely to report data which deviates from the true value than larger departments. For example, let’s say we have a school where the staff turnover is 20%: a small department is likely to have a greater deviation from this 20% than a larger department. As such, you would need to be extremely careful about drawing any conclusions about the quality of leadership and management within these departments based on this data (that said, there may be issues, and other sources of data may need to be looked at).
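To get a feel for how much a department’s observed turnover might bounce around purely by chance, here is a minimal sketch. It assumes each member of staff leaves independently with the same 20% probability (a simplifying binomial assumption); the department sizes and the function name are hypothetical.

```python
import math


def turnover_standard_error(true_rate: float, staff_count: int) -> float:
    """Standard deviation of the observed turnover proportion when each of
    `staff_count` staff leaves independently with probability `true_rate`."""
    return math.sqrt(true_rate * (1 - true_rate) / staff_count)


for staff_count in (5, 20, 50):  # hypothetical department sizes
    se = turnover_standard_error(0.20, staff_count)
    print(f"{staff_count:>2} staff: 20% turnover, give or take {se:.1%}")
```

On these assumptions the chance variation for a five-person department is about three times that of a fifty-person department, so an apparently ‘high’ turnover figure in a small department may be nothing more than noise.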


Confounding

A confound occurs when the true relationship between two variables is hidden by the influence of a third variable. For example, the senior leadership team of a school may assume that there is a direct and positive relationship between teaching expertise and pupils’ results and may interpret any decline in results as being the result of ‘poor’ teaching. However, it might not be the teacher’s expertise which is the major contributory factor in determining results. It may have been that a number of pupils – for reasons completely beyond the control of the teacher – just did not ‘perform on the day’ and made a number of quite unexpected errors. Indeed, as Crawford and Benton (2017) note, pupils not performing on the day for some reason or another is a major factor in explaining differences in results between year groups.

Range restriction

This occurs when a variable in the data has less than the range it possesses in the population as a whole and is often seen when schools use A level examination results for marketing purposes. On many occasions, schools or sixth form colleges publicise A level pass rates of 98, 99 or 100%. However, what this information does not disclose is how many pupils/students started A levels and subsequently either did not complete their programme of study or were not entered for the examination. Nor does it state how many pupils gained the equivalent of three A levels. So, if attention is focused on the number of pupils gaining three A levels or their equivalent, then a completely different picture of pupil success at A level or its equivalent may emerge.
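A quick calculation shows how much the choice of denominator matters. The cohort numbers below are entirely made up for illustration and the function name is hypothetical; the point is only that the same results give two very different ‘pass rates’.

```python
def pass_rates(started: int, entered: int, passed: int) -> tuple[float, float]:
    """Return (pass rate among pupils entered, pass rate among pupils who started)."""
    if not (0 <= passed <= entered <= started) or entered == 0:
        raise ValueError("expected 0 <= passed <= entered <= started, with entered > 0")
    return passed / entered, passed / started


# Hypothetical cohort: 120 pupils start the course, 100 are entered, 99 pass.
headline_rate, whole_cohort_rate = pass_rates(started=120, entered=100, passed=99)
print(f"Pass rate among those entered: {headline_rate:.1%}")
print(f"Pass rate among those who started: {whole_cohort_rate:.1%}")
```

With these invented figures the publicised rate is 99%, while the rate for everyone who began the course is 82.5%.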


If you want to make sure you don’t draw the wrong conclusions from school data it will make sense to:

·     Aggregate data where appropriate so that you have larger sample sizes

·     Use a range of different indicators to try and get around the problem of measurement error with any one indicator

·     Try and actively look for data which challenges your existing preconceptions – make sure all the relevant data is captured and made available – not just that data which supports your biases.

·     Avoid jumping to conclusions – more often than not there will be more than one explanation of what happened.

And finally 

Remember, even if you have ‘accurate’ data, this data can still be misrepresented through the misuse of graphs, percentages, p-values and confidence limits.

References and further reading

Barends, E and Rousseau, D (2018) Evidence-Based Management: How to make better organizational decisions, London, Kogan Page

Crawford, C. and Benton, T. (2017) Volatility Happens: Understanding Variation in Schools’ GCSE Results. Cambridge Assessment Research Report. Cambridge, UK

Jones, G (2018) Evidence-based School Leadership and Management: A practical guide, London, Sage Publishing

Selfridge, R (2018) Databusting for Schools. How to use and interpret education data. London, Sage Publishing