The school research lead and how to avoid being the drunkard under the lamp post

A major challenge for school leaders and teachers interested in evidence-based practice (EBP) is to be constantly seeking out EBP's limitations and weaknesses. To do this, I recommend that headteachers and school research leads pay close attention to the work of Professor Trisha Greenhalgh, who recently authored an article entitled Of Lamp Posts, Keys, and Fabled Drunkards: A Perspectival Tale of 4 Guidelines (Greenhalgh, 2018). In this article, Professor Greenhalgh describes her own experience as a patient following a high-impact cycling accident, and how evidence-based guidelines were misused in her treatment.

The use and abuse of guidelines

Without going into the details of Professor Greenhalgh's accident - which involved coming off a bicycle at 20 mph and hitting the road surface, resulting in multiple fractures - there were, according to Professor Greenhalgh's account, a number of occasions during her treatment where guidelines were either misapplied or not used at all:

  • a guideline that existed and was relevant but which was not used 
  • a guideline that was not relevant but which was used 
  • a guideline that was relevant but was misremembered and misapplied by commentators claiming to be giving evidence based advice 
  • a guideline that did not exist but which was quoted by adherents of EBM as if it had existed (and which was also misremembered and misapplied).

Professor Greenhalgh subsequently identifies three reasons why this misuse of guidelines can happen.

First, we are hard-wired to classify. So when a doctor comes across a patient, there is a tendency to classify them as part of a group.  Once that is done, the patient tends to be treated on the basis of 'guidelines' which are designed to meet the needs of the 'average patient', not the individual.

Second, there is bounded rationality - that is, the idea that because real-world decisions often involve numerous options, outcomes, and contextual factors, we unconsciously simplify the problem to make it possible to cope with cognitively and manage practically.  Indeed, the inexorable pressures of modern clinical work often require us to use such "fast and frugal" reasoning (p. 6).

Third, there is an over-valuing of rationality (doing the thing right - as in following rules and guidelines) over reason (doing the right thing - as in making the right moral choice for this patient at this time, given these contingencies) (p. 6).

What are the implications for senior school leaders and school research champions?

  • Do 'average' pupils, 'average' classes or 'average' schools exist, or are they all unique with their own special requirements?
  • It is essential to keep up to date with the latest research and guidance provided by the Education Endowment Foundation, as otherwise you may miss out on something that could have real benefits for your pupils.
  • However, just because the EEF have produced a new set of guidance, added something to the Teaching and Learning Toolkit, or published promising research findings - this does not make them a priority for your school, as there may be other matters or issues which are far more relevant to your pupils' needs.
  • In these very early days of Research Schools and relatively inexperienced school research leads, there are very real risks that colleagues may get things wrong - research may be misremembered or misapplied.  So it is really important, when someone says 'the research says', that the response is 'OK, what claim are you making and what is the warrant for your claim?' (Wallace and Wray, 2016; Booth, Colomb, et al., 2016).
  • What structures have been put in place to help identify the misapplication or misuse of research evidence or guidelines? What processes are in place to help address the consequences of things going 'wrong'?
  • There is a distinction between clinical judgment and organisational judgment - clinical judgment refers to decisions made about individual patients, whereas organisational judgment is applied at scale, across the organisation.   As such, evidence-based practice may be more useful when applied to the school as a whole, rather than trying to apply it to decisions about individual pupils.
  • The evidence about a particular problem is never set in stone and there is an ongoing need for conversations to continue to unpick the nature of the problem, so the appropriate actions can be taken.

To conclude

Professor Greenhalgh cites Sir John Grimley Evans, who wrote in 1995:

There is a fear that in the absence of evidence clearly applicable to the case in hand a clinician might be forced by guidelines to make use of evidence which is only doubtfully relevant, generated perhaps in a different grouping of patients in another country at some other time and using a similar but not identical treatment. This is evidence biased medicine; it is to use evidence in the manner of the fabled drunkard who searched under the street lamp for his door key because that is where the light was, even though he had dropped the key somewhere else. (p. 451)

References

Booth, W., Colomb, G., Williams, J., Bizup, J. and Fitzgerald, W. (2016). The Craft of Research (Fourth Edition). Chicago: The University of Chicago Press.
Greenhalgh, T. (2018). Of Lamp Posts, Keys, and Fabled Drunkards: A Perspectival Tale of 4 Guidelines. Journal of Evaluation in Clinical Practice.
Wallace, M. and Wray, A. (2016). Critical Reading and Writing for Postgraduates (Third Edition). London: Sage.

Senior Leaders and Coaching: Are you doing more harm than good?

A recent article in the May-June 2018 edition of the Harvard Business Review reports on research conducted by Gartner, which found that a certain type of coaching - the 'Always On' manager - does more harm than good, with a negative impact on performance.  In addition, the Gartner study found little correlation between the time spent coaching and employee performance.

For the research, Gartner surveyed 7,300 employees and managers across a number of industries, along with interviewing or surveying 325 HR executives, and found four different approaches to coaching:

Teacher Managers – coach employees on the basis of their own knowledge and experiences, providing advice-oriented feedback and personally directing development.  Many have expertise in technical fields and spent years as individual contributors before working their way into managerial roles.

Always On Managers – provide continual coaching, stay on top of employees’ development and give feedback across a range of skills.  Their behaviours closely align with what HR professionals typically idealise.  These managers may appear to be the most dedicated of the four types to upgrading their employees’ skills – they treat it as part of their daily job.

Connector Managers give targeted feedback in their areas of expertise; otherwise, they connect employees with others on the team or elsewhere in the organisation who are best suited to the tasks.  They spend more time than the other three types assessing the skills, needs, and interests of their employees, and they recognise that many skills are best taught by people other than themselves.

Cheerleader Managers – take a hands-off approach, delivering positive feedback and putting employees in charge of their own development.  They are available and supportive, but they aren’t as proactive as the other types of managers when it comes to developing employees’ skills. (Harvard Business Review)

The article goes on to note that:
  • The four types are more or less evenly distributed within organisations, regardless of the industry
  • Whether a manager spends 36% or 9% of their time on coaching and employee development did not seem to matter – it’s more about the quality than the quantity of coaching
  • Hyper-vigilant Always On managers appear to do more harm than good.
The article highlights three reasons why Always On Managers have a negative impact on performance:
  1. The continual stream of feedback is often overwhelming
  2. They spend less time focussing on employees’ real needs and more time on issues that are less relevant to those needs
  3. They fail to recognise the limits of their own expertise, effectively making it up as they go along
On the other hand, employees managed by Connectors were three times more likely to be high performers than employees managed by the other types of coaches.  The article notes that from the research this seemed to be explained by Connectors doing four things:
  1. Asking the right questions
  2. Providing tailored feedback
  3. Helping colleagues connect and network with other colleagues who can help them
  4. Recognising the limits of their own skills
The Gartner researchers then go on to recommend that managers take the following actions:
  • Focus on quality of coaching not the quantity 
  • Find out about your employees’ aspirations for the future and the skills, knowledge and experience they need to achieve those aspirations
  • Have open coaching conversations, shifting the focus from one-to-one conversations to team coaching, where colleagues learn from one another, particularly from those with specific skills
  • Try and extend these activities across the organisation
So what are the implications of these findings for senior leadership teams?

It seems to me that there are several implications.
  • A bit of humility goes a long way – it’s OK as a leader to admit that you are not an expert on something and to point colleagues in the direction of others – this requires the development of a culture of trust and mutual vulnerability.
  • Your most proactive coaches and line managers, who are constantly giving feedback, may inadvertently be making things worse.
  • Give some thought to the types of coaching currently evident in your school and reflect on whether they are doing more harm than good.
  • Focus on the quality of the coaching being given rather than the quantity.
  • Line managers may not necessarily be in the best position to be coaches, unless they have the appropriate skills.
  • When appointing staff to senior roles, you may wish to ask interviewees to give examples of how they have gone about coaching others and look for evidence of ‘connecting’ activities
  • When developing your own career, you may wish to look to work for leaders and managers who have a connecting coaching style.
And finally 

As Professor Steve Higgins of the University of Durham said when commenting on a meta-analysis on coaching by Kraft, Blazar, et al. (2016): ‘it ain’t what you do, it’s the way that you do it’.

PS

It’s important to note that this research was not conducted in schools – so there are issues as to its applicability to schools in England.  In addition, with both this type of research and reporting, there would be some merit in looking at the original research conducted by Gartner, which, to be honest, I have not been able to do.

References

‘Coaching vs Connecting: What the Best Managers Do to Develop Their Employees Today’. Gartner, White Paper.
Managers Can’t Be Great Coaches All by Themselves. Harvard Business Review. May-June, 2018.
Kraft, M. A., Blazar, D. and Hogan, D. (2016). The Effect of Teacher Coaching on Instruction and Achievement: A Meta-Analysis of the Causal Evidence.


The school research lead - understanding p-values, statistical significance and avoiding misconceptions

A major challenge for aspiring evidence-informed teachers is knowing when to trust the experts.  It would be easy to assume that just because you have come across a particular interpretation of a concept or idea in a number of different places - book, peer-reviewed article or blog - that it is correct.  Unfortunately, if you did this, you could well be making a mistake.  For example, in recent weeks I have come across three examples – Churches and Dommett (2016), Firth (2018) and Ashman (2018) – where the meaning of p-values and statistical significance would appear to have been misinterpreted.  Furthermore, as Gorard et al (2017) state, these mistakes are not uncommon. So to help aspiring school research leads and evidence-informed teachers spot where p-values and statistical significance have been misinterpreted, I will:
  • Explain what is meant by the terms p-values and statistical significance
  • Identify a number of common misconceptions about p-values and statistical significance
  • Show how the work of Churches and Dommett, Firth and Ashman all falls foul of some of these misconceptions and misinterpretations
  • Examine some of the implications for evidence-informed teachers.
And to help me do this I’m going to draw upon the work of Greenland, Senn, et al. (2016) and the American Statistical Association’s statement on p-values (Wasserstein and Lazar, 2016).

P values and statistical significance

When seeking to understand these terms there are a number of major problems, as Greenland, et al. (2016) state: ‘There are no interpretations of these concepts, which are at once simple, intuitive, correct, and foolproof’ (p. 337). Greenland et al go on to illustrate their point by providing twenty-five examples of common misconceptions and misinterpretations of these terms, to which even professional academics are prone.  Nevertheless, the American Statistical Association seeks to informally define a p-value as: ‘the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value’.

The smaller the p-value, the more surprising our results are if the null hypothesis (and test assumptions) hold true.  Whereas the larger the p-value, the less surprising our results are, given that the null hypothesis (and test assumptions) hold true.  In other words, as Greenland et al state: ‘The P value simply indicates the degree to which the data conform to the pattern predicted by the test hypothesis and all the other assumptions used in the test (the underlying statistical model). Thus P = 0.01 would indicate that the data are not very close to what the statistical model (including the null hypothesis) predicted they should be, while P = 0.40 would indicate that the data are much closer to the model prediction, allowing for chance variation’ (p. 340).
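This idea can be made concrete with a short simulation (an illustrative sketch using invented scores, not data from any study discussed in this post). A permutation test computes a p-value directly as the proportion of random re-labellings of the data that produce a group difference at least as extreme as the one actually observed, under the null model of ‘no group difference’:

```python
# A minimal sketch of what a p-value is: the proportion of random
# re-labellings of the pooled data that produce a difference in means
# at least as extreme as the one observed, under the null model.
import random
import statistics

random.seed(42)

def permutation_p_value(a, b, n_sim=2000):
    """Two-sided permutation test on the difference in group means."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    at_least_as_extreme = 0
    for _ in range(n_sim):
        random.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) -
                   statistics.mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sim

# Two hypothetical groups of test scores drawn from the SAME
# distribution, so the null hypothesis is true by construction.
group_a = [random.gauss(50, 10) for _ in range(30)]
group_b = [random.gauss(50, 10) for _ in range(30)]
p = permutation_p_value(group_a, group_b)
print(round(p, 3))  # typically large: the data fit the null model well
```

Note that the p-value here says nothing about whether the null is true; it only measures how compatible these particular data are with it.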

Statistical Significance

Put very simply, a result is often deemed to be statistically significant if the p-value is less than or equal to 0.05, although the level of statistical significance can be set at lower levels, for example, p less than or equal to 0.01.
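The arbitrariness of that threshold can also be seen in a quick simulation (again a sketch with simulated scores, using a normal approximation rather than a full t-test). When two groups are drawn from the same distribution, so that there is genuinely no effect, roughly one comparison in twenty will still come out ‘significant’ at the 0.05 level:

```python
# Sketch: with the null true by construction, about 5% of comparisons
# still fall below the 0.05 threshold. Uses a normal approximation to
# the two-sample test, adequate for illustration only.
import math
import random
import statistics

random.seed(1)

def approx_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

alpha = 0.05
trials = 1000
false_alarms = 0
for _ in range(trials):
    # Both groups drawn from the same distribution: no real effect.
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    if approx_p_value(a, b) < alpha:
        false_alarms += 1

print(false_alarms / trials)  # close to alpha, i.e. roughly 0.05
```

This is exactly why a single ‘significant’ result, on its own, proves nothing about an intervention.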

Interpreting p values and statistical significance – guidance from the American Statistical Association

Given the difficulties in interpreting p-values and statistical significance, the American Statistical Association – Wasserstein and Lazar (2016) – has provided some guidance on how to avoid some common mistakes.  This guidance is summarised in six principles:
  • P-values can indicate how incompatible the data are with a specified statistical model.
  • P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  • Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  • Proper inference requires full reporting and transparency.
  • A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  • By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
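The fifth principle is worth dwelling on, and a little arithmetic makes it concrete (a sketch with invented numbers, assuming equal group sizes, equal standard deviations and a normal approximation). A trivial difference measured on a very large sample can produce a far smaller p-value than a large difference measured on a small one:

```python
# Sketch: a p-value does not measure effect size (ASA principle 5).
import math

def p_for_mean_diff(mean_diff, sd, n):
    """Two-sided p-value for a difference in means, assuming equal
    group sizes and standard deviations (normal approximation)."""
    se = sd * math.sqrt(2 / n)
    z = mean_diff / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# A trivial effect (0.05 SD) measured on a huge sample...
tiny_effect = p_for_mean_diff(mean_diff=0.5, sd=10, n=10000)
# ...versus a large effect (0.5 SD) measured on a small sample.
big_effect = p_for_mean_diff(mean_diff=5.0, sd=10, n=10)

print(tiny_effect < 0.01)   # True: "highly significant"
print(big_effect > 0.05)    # True: "not significant"
```

The first result would be reported as highly statistically significant despite being educationally trivial; the second as not significant at all, despite the underlying effect being ten times larger.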

Some common misinterpretations – Churches, Dommett, Firth and Ashman

I will now look at how Churches and Dommett, Firth and Ashman have - in my view - all misinterpreted either p-values or statistical significance.

Richard Churches and Eleanor Dommett - In their book Teacher-Led Research: Designing and implementing randomised controlled trials and other forms of experimental research – they include the following definitions within their glossary of terms:

p-value – Probability value – that is the probability that the result may have occurred by chance (e.g p = 0.001 – a 1 in 1000 probability that the result may have happened by chance) Also known as the significance level.

Significance – The probability that a change in score may have occurred by chance. A threshold for significance (alpha) is set at the start of piece of research.  This is never less stringent than 0.05 ….

Unfortunately, according to the ASA, both of these statements are incorrect.  First, the p-value is a measure of the consistency of the results with a particular statistical model, with all the assumptions behind the model being maintained.  Second, the p-value is not the probability that the data were produced by random chance alone, as it also depends on the accuracy of the assumptions underpinning the statistical model.  Third, the definition of significance conflates scientific significance with statistical significance.
   
Jonathan Firth - Firth, J. (2018). The Application of Spacing and Interleaving Approaches in the Classroom. Impact. 1. 2.

In a recent edition of Impact, Jonathan Firth applies p-values and statistical significance to the use of spacing and interleaving in the classroom, in a study where an opportunity sample of 31 school pupils between 16 and 17 years of age was used.

The mean percentages of correct answers on the end-of-task test for the interleaved and blocked conditions are shown in Figure 4. A between-subjects ANOVA was carried out. This analysis revealed a significant main effect of spacing (performance in the spaced condition being worse than the massed condition, with mean scores of 12.25 vs 9.45, p = .002), while interleaving did not have a significant main effect.  Importantly, there was also a significant (p = .009) interaction between the two variables (spacing vs interleaving), indicating that interleaving had a mediating or protective effect against the difficulties caused by spacing (see Figure 5). 

The findings demonstrated that spacing had a harmful effect on the immediate test, while the main effect of interleaving was neutral. The results fit with the idea that these are ‘desirable difficulties’, with the potential to impede learning in the short term.

Again, according to the ASA, there are errors in both paragraphs.  Statistical significance does not demonstrate whether a scientifically or substantively important relation has been detected.  Neither is statistical significance a property of the phenomenon being studied; rather, it is a product of the consistency between the data and what would have been expected under the specified statistical model.  In other words, the map is not the territory.

Greg Ashman - Ashman (2018) The Article That England’s Chartered College Will Not Print. Filling the Pail.

In a blogpost which criticises the EEF’s approach to both meta-cognition and meta-analysis, Greg also falls foul of the problems of interpreting p-values and statistical significance.

If we focus only on the randomised controlled trials conducted by the EEF, the case for meta-cognition and self-regulation seems weak at best. Of the seven studies, only two appear to have statistically significant results. In three of the other studies, the results are not significant and in two more, significance was not even calculated. This matters because a test of statistical significance tells us how likely we would be to collect this particular set of data if there really was no effect from the intervention. If results are not statistically significant then they could well have arisen by chance.

Again, using the ASA’s guidance, there are a number of errors in this statement.  First, statistical significance – or rather the lack of it – does not tell us whether there was an effect from the intervention; it only tells us how consistent the data were with the specified statistical model.  Second, whether or not the results are statistically significant does not tell us whether they arose by chance: a p-value is a statement about the data in relation to a specified hypothetical explanation, not a statement about the explanation itself.  In other words, it is a statement about the results of the study relative to a particular statistical model.

Where does this leave us?

First, p-values and statistical significance are slippery concepts, which take time and effort even to begin to understand, let alone master.  Indeed, you may need to forget what you have already learnt at university on undergraduate or postgraduate courses.

Second, the misuse of p-values and statistical significance is not uncommon, so it is something you have to watch out for when reading quantitative research reports.  Keep the ASA principles to hand to see if they are being misapplied.  You don’t have to understand fully how something works (though it helps) to be able to spot its misuse.

Third, just because you come across something in a variety of formats – book, peer-reviewed article or blog – and from a variety of authors – university researchers or school teachers – does not mean it is correct.

Fourth, I am not making comments about the personal integrity of any of the authors I have criticised.  These comments should be seen as ‘business not personal’ and are a genuine attempt to increase the research literacy of teachers and school leaders.  Being an evidence-informed teacher or school leader is hard enough when you are using the right, never mind the wrong, tools.

And finally,  it’s worth remembering the words of Greenland, et al. (2016) who state: ‘In closing, we note that no statistical method is immune to misinterpretation and misuse, but prudent users of statistics will avoid approaches especially prone to serious abuse. In this regard, we join others in singling out the degradation of P values into ‘‘significant’’ and ‘‘nonsignificant’’ as an especially pernicious statistical practice.’ p348.

References

Ashman, G. (2018). The Article That England’s Chartered College Will Not Print. Filling the Pail. https://gregashman.wordpress.com/2018/04/17/the-article-that-englands-chartered-college-will-not-print/. 21 April, 2018.
Churches, R. and Dommett, E. (2016). Teacher-Led Research: Designing and Implementing Randomised Controlled Trials and Other Forms of Experimental Research. London. Crown House Publishing.
Firth, J. (2018). The Application of Spacing and Interleaving Approaches in the Classroom. Impact. 1. 2.
Gorard, S., See, B. and Siddiqui, N. (2017). The Trials of Evidence-Based Education. London. Routledge.
Greenland, S., Senn, S., Rothman, K., Carlin, J., Poole, C., Goodman, S. and Altman, D. (2016). Statistical Tests, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations. European journal of epidemiology. 31. 4. 337-350.

Wasserstein, R. and Lazar, N. (2016). The ASA's Statement on P-Values: Context, Process, and Purpose. The American Statistician. 70. 2. 129-133.

Trust in Schools

Anyone with a passing acquaintance with Twitter who is interested in education will know that the notion of 'trust' in schools - or the lack of it - is constantly being commented upon.  Governments should trust schools more to get on with the job of educating pupils.  CEOs of multi-academy trusts should trust their senior leaders to come up with local solutions for school problems.  Senior leaders should trust teachers more to know what is best for their pupils and let them get on with the job of teaching in the classroom.  Teachers should trust senior leaders to know what is right for the school.  Parents should trust teachers to do their best for children.  Teachers should trust pupils to take responsibility for their own learning.  In other words, trust is a good thing and there should be more of it.  However, high levels of trust are not easy to create, develop and maintain, and can be very easily lost.  So in this post, I will use the work of Romero and Mitchell (2018) to explore:

  • The importance of trust in schools.
  • The nature of trust.
  • Implications for the leadership and management of schools in creating, maintaining and developing trust.

The importance of trust in schools

Romero and Mitchell provide a range of evidence to support the following claims:

  • Trust is important in high functioning modern institutions
  • Trust is a defining characteristic of professional work
  • Trust between teachers and school leaders plays an important role in attempts to collaborate, openness to new ideas, mentoring and professionalism
  • Student trust of teachers is associated, for example, with academic achievement and good behaviour
  • Trust is essential for effective partnerships between schools and parents.
  • Trust is important between the different levels of an educational organisation, system or institution 

However, whether these claims are fully warranted would depend upon a careful analysis of the supporting evidence for each claim (Wallace and Wray, 2016).  Nevertheless, for the purposes of this post, I am going to assume that each claim stands up to critical scrutiny.

The nature of trust

The components of trust are subject to some debate, with Adams and Miskell (2016) hypothesising that trust consists of five components - benevolence, competence, honesty, openness, and reliability - and Bryk and Schneider (2002) arguing that relational trust consists of four components - respect, personal regard, personal integrity and competence in core responsibilities.  However, Romero and Mitchell state that trust has three key facets:

  • Benevolence - is the sense that the trusted party has the trustee's best interests at heart.
  • Competence - reflects the belief that the trustee has the needed skills and abilities.
  • Integrity - reflects the belief that the trustee will behave fairly and ethically.

As such, Romero and Mitchell argue that trust is effectively a second-order factor: it is a function of the levels of all three facets, with each being present to varying degrees.  This has a number of consequences both for attempts to measure trust, i.e. the need to measure all three facets, and for how to develop, maintain or restore trust in schools.  For example, there may be low levels of trust in a school if individuals act with high levels of benevolence and integrity but with low levels of competence.  In other words, trust requires the presence of high levels of benevolence, competence and integrity.

What are the implications for trust in schools?

It seems to me that this analysis has a number of implications for schools and school leaders.

  1. Given the interrelationship between trust and each of benevolence, competence and integrity, low levels of trust within schools may well be the norm. This does not mean low levels of trust should be deemed acceptable; instead it should be seen as a recognition of the challenge of creating high-trust environments.
  2. If school leaders wish to develop levels of trust within a school, it will involve spinning 'multiple plates' - just being deemed to be a good person or good at your job will not be enough to generate trust.
  3. The actions necessary to develop trust in schools will depend very much on the situation in each school.  If there are perceived low levels of benevolence, competence and integrity, this will require sustained action across all three factors.  Whereas, if the concerns are about leader competence, it may require a school leader to focus on doing the basics of school leadership - managing pupil behaviour, recruiting staff on time, and keeping the books balanced.
  4. If you accept the notion that what it means to be competent changes over time, with increasing levels of performance being required to be deemed competent, then schools have no choice but to invest constantly in the professional learning and development of ALL staff.
  5. At whatever level of the school system you operate - be it as the CEO of a MAT, a school leader, head of department, teacher or teaching assistant - do not take trust for granted, as it can so easily slip through your fingers and disappear.
  6. Probably the simplest thing to do when trying to develop a high-trust environment is to adopt Bob Sutton's No Asshole Rule (Sutton, 2007).

References

Adams, C. M. and Miskell, R. C. (2016). Teacher Trust in District Administration: A Promising Line of Inquiry. Educational Administration Quarterly. 52. 4. 675-706.
Bryk, A. and Schneider, B. (2002). Trust in Schools: A Core Resource for Improvement. New York. Russell Sage Foundation.
Sutton, R. I. (2007). The No Asshole Rule: Building a Civilized Workplace and Surviving One That Isn't. London. Hachette UK.
Wallace, M. and Wray, A. (2016). Critical Reading and Writing for Postgraduates (Third Edition). London. Sage.

School Leadership and Civility


In a recent post I argued that both procedural and interactional justice within schools are essential components of promoting teacher and organisational well-being.  Indeed, thanks to a retweet by Jill Berry @jillberry102, this post generated some traffic on Twitter, the vast majority of which was supportive.  However, the post was interpreted by some as 'SLT bashing', which was never the intent.  Ironically, the post was designed to be supportive of SLTs by identifying evidence-based strategies which could be adopted and which may reduce both staff turnover and teachers leaving the profession.

In this post I'm going to continue to look at strategies which can support interactional justice within schools.  In doing so, I'm going to look at the work of Christine Porath (Porath, 2018) on how promoting civility can have an important role in developing interactional justice.   Porath argues that if you want colleagues to be 'civil' to one another it is important that leaders engage in conversation with team members to establish precisely what civility means.  By doing this, Porath argues that it then becomes much easier to generate support for 'civility' as a way of doing things, and at the same time empowers colleagues to hold each other to account.

Porath then goes on to describe the code of civility of a law firm, Bryan Cave, which has 10 elements.

Bryan Cave's Code of Civility

1 We greet and acknowledge each other.

2 We say please and thank you.

3 We treat each other equally and with respect, no matter the conditions.

4 We acknowledge the impact of our behaviour on others.

5 We welcome feedback from each other.

6 We are approachable.

7 We are direct, sensitive, and honest.

8 We acknowledge the contributions of others.

9 We respect each other's time commitments.

10 We address incivility.


However, Porath then argues that it is not enough to define cultural norms of civility; colleagues also need to receive specific training which examines:

" What civility looks like
" Situations where colleagues may act with a lack of civility
" Techniques to maintain civility when under pressure
" Opportunities to practise being civil

So what are the implications for school leaders?

If you accept the notion that how you behave has an impact on others, and that  school leader civility may be an important part of a school's strategy for retaining staff, the following may be worth considering.

1. Keep a daily civility diary and record where you may have behaved in a way which lacked civility - and reflect on what might have triggered that behaviour.
2. Ask a colleague to observe how you behave in meetings and other settings - and whether they can identify occasions where you have acted in a manner which could be described as disrespectful to others.
3. See if you can spot when colleagues have acted with a lack of civility towards one another and ask the following:
a. Did you intervene?
b. Is this behaviour new?
c. What are you going to do about it?

And finally 

Am I holding myself up as a paragon of virtue when it comes to civility? Absolutely not.  What I do know is that, as a senior leader, I could have done a better job at being civil, and I should have been more proactive when colleagues displayed less than 'civil' behaviour towards other colleagues.  In future posts I will begin to explore the role of trust within schools.

Reference

Porath, C. (2018). Make Civility the Norm on Your Team. Harvard Business Review.