The Big Evidence Debate, effect sizes and meta-analysis - Are we just putting lipstick on a pig?

Tuesday 4 June saw a collection of ‘big name’ educational researchers – Nancy Cartwright, Rob Coe, Larry Hedges, Steve Higgins and Dylan Wiliam – come together with teachers, researchers and knowledge brokers for the ‘big evidence debate’. The questions at stake: to what extent can the metrics from good-quality experiments and meta-analyses really help us improve education in practice, and is meta-analysis the best we can do?

Now, if you turned up for the ‘big evidence debate’ expecting academic ‘rattles’ and ‘dummies’ to be thrown out of the pram, then you were going to be disappointed. It quickly became apparent that there was a broad consensus amongst the majority of the presenters. This consensus – and I’ll come back to a minority view later – can be summarised along the following lines:

• Most of the criticisms made of effect sizes by scholars such as Adrian Simpson – a notable absentee from those participating in the debate – have been known about by researchers for the best part of forty years.

• Many of the studies included in meta-analyses are of low quality, and more high-quality, well-designed educational experiments are needed, as they form the bedrock of any meta-analysis or meta-meta-analysis.

• Just because someone is a competent educational researcher does not make them competent at undertaking a systematic review and associated meta-analysis.

• It’s incredibly difficult, if not impossible, for even the well-informed ‘lay-person’ to make any kind of judgement about the quality of a meta-analysis.

• There are too many avoidable mistakes being made when undertaking educational meta-analyses – for example, inappropriate comparisons; file-drawer problems; intervention quality; and variation in variability.

• However, there are some problems in meta-analysis in education which are unavoidable: aptitude x treatment interactions; sensitivity to instruction; and the selection of studies.

• Nevertheless, regardless of how well they are done, we need to get better at communicating the findings arising from meta-analyses so that they are not subject to over-simplistic interpretations by policymakers, researchers, school leaders and teachers.

Unfortunately, there remains a nagging doubt: if we do all these things – better original research, better meta-analysis and better communication of the outcomes – then all that we may be doing is ‘putting lipstick on a pig’. In other words, even if we make all these changes and improvements in meta-analysis, they in themselves do not tell practitioners much, if anything, about what to do in their own context and setting. Indeed, Nancy Cartwright argued that whilst a randomised controlled trial may tell you something about ‘what worked’ there, and a meta-analysis may tell you something about what worked in a number of places, they cannot tell you anything about whether ‘what worked there’ will ‘work here’. She then goes on to use the image of randomised controlled trials and meta-analysis as being ‘like the small twigs in a bird’s nest. A heap of these twigs will not stay together in the wind. But they can be sculpted together in a tangle of leaves, grass, moss, mud, saliva, and feathers to build a secure nest’ (Cartwright, 2019, p. 13).

As such, randomised controlled trials and meta-analyses should be a small proportion of educational research and should not be over-invested in. Instead, a whole range of activities should be engaged in, for example, case studies, process tracing, ethnographic studies, statistical analysis and the building of models.

Given the above, what are the implications for teachers of the ‘big evidence debate’? Well, if we synthesise the recommendations for teachers from both Dylan Wiliam and Nancy Cartwright, it’s possible to come up with six questions teachers and school leaders should ask when trying to use educational research to bring about improvement in schools.

1. Does this ‘intervention’ solve a problem we have?

2. Is our setting similar enough, in ways that matter, to the other settings in which the intervention appears to have worked?

3. What other information can we find – be it from other fields and disciplines outside of education, or our own knowledge of our school and pupils – so that we can derive our own causal model and theory of change of how the intervention could work?

4. What needs to be in place for the theory of change to work in our school?

5. How much improvement will we get? What might get in the way of the intervention so that good effects are negligible? Will other things happen to make the intervention redundant?

6. How much will it cost?

Links

https://www.dylanwiliam.org/Dylan_Wiliams_website/Presentations.html

Further reading

Nancy Cartwright (2019): What is meant by “rigour” in evidence-based educational policy and what’s so good about it?, Educational Research and Evaluation, DOI: 10.1080/13803611.2019.1617990

Steven Higgins (2018): Improving Learning: Meta-analysis of Intervention Research in Education. Cambridge University Press, Cambridge, UK

Adrian Simpson (2019): Separating arguments from conclusions: the mistaken role of effect size in educational policy research, Educational Research and Evaluation, DOI: 10.1080/13803611.2019.1617170

Dylan Wiliam (2019): Some reflections on the role of evidence in improving education, Educational Research and Evaluation, DOI: 10.1080/13803611.2019.1617993

Disciplined Inquiry, performance review and asking well-structured and formulated questions

Recently, I wrote about how disciplined inquiry was being used by some schools as a central part of their teacher performance review scheme. Now, models of disciplined inquiry will often be based around some form of structured inquiry question, such as the one put forward by the Institute for Effective Education:

What impact does (what practice?) delivered (over how long?) have on (what outcome?) for (whom?)?

Two examples of this type of inquiry question have been very helpfully provided by Shaun Allison and Durrington High School:

• What impact does increasing the frequency of modelling writing, followed by structured metacognitive reflection in lessons delivered over a year have on the quality of creative writing for my two Y10 classes?

• What impact does explicitly teaching Tier 2 and 3 geographical vocabulary using knowledge organisers delivered over a year have on the appropriate use of Tier 2/3 vocabulary in written responses for the disadvantaged students in my Y8 class?
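For anyone who finds it easier to see the template as four slots to fill in, here is a minimal sketch in Python. The class name `InquiryQuestion` and its field names are my own hypothetical labels for the four bracketed prompts; they are not part of the Institute's template, and the sketch simply slots text into the wording quoted above.

```python
from dataclasses import dataclass


@dataclass
class InquiryQuestion:
    """Hypothetical container for the four slots in the inquiry template."""

    practice: str  # what practice?
    duration: str  # over how long?
    outcome: str   # what outcome?
    group: str     # for whom?

    def render(self) -> str:
        # Slot the four parts into the template wording quoted above.
        return (
            f"What impact does {self.practice} delivered {self.duration} "
            f"have on {self.outcome} for {self.group}?"
        )


# The second Durrington example, split into its four parts.
question = InquiryQuestion(
    practice=(
        "explicitly teaching Tier 2 and 3 geographical vocabulary "
        "using knowledge organisers"
    ),
    duration="over a year",
    outcome="the appropriate use of Tier 2/3 vocabulary in written responses",
    group="the disadvantaged students in my Y8 class",
)
print(question.render())
```

Splitting a draft question this way makes it easy to spot a slot that is still vague, for example an outcome that cannot be observed or a group that has not been defined.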

However, given the diversity of teaching staff, it is unlikely that a single question structure is going to meet every teacher’s needs, interests or requirements. Furthermore, a single question structure is unlikely to be sustainable over a number of years – with teachers losing enthusiasm for ‘disciplined inquiry’ when asked to do more of the same.

With this in mind, it’s probably worth examining a number of other formats for developing structured questions. One way of doing this is to use a conceptual tool known as PICO, which is explained below:

• Pupil or Problem - Who? - How would you describe the group of pupils or problem?

• Intervention - What or how? What are you planning to do with your pupils?

• Comparison - Compared to what? What is the alternative to the intervention – what else could you do?

• Outcome - Aim or objective(s) - What are you trying to achieve?

Sometimes additional elements are added to PICO, including C for context and the type of school, class or setting – or T for time – which relates to the time period it takes for the intervention to achieve the outcomes you are trying to achieve.

Although PICO is probably the most widely used structure for formulating questions, there are a number of variations which could be used. These alternatives are especially useful when your focus is not just on the outcomes for pupils but also on other issues, such as who the stakeholders in the situation are, whose perspective you are looking from, and how pupils experience the intervention. Examples of these alternative frameworks include:

  • PESCICO Pupils, Environment, Stakeholders, Intervention, Comparison, Outcome

  • EPICOT Evidence, Pupils, Intervention, Comparison, Outcome, Time-period

  • PIE Pupils, Intervention, Experience/Effect

  • SPICE Setting, Perspective, Intervention, Comparator, Evaluation

  • PISCO Pupils, Intervention, Setting, Comparison, Outcome

  • CIMO Context, Intervention, Mechanism, Outcome

  • CIAO Context, Intervention, Alternative Intervention, Outcome

Let’s now look at worked examples for two frameworks: PISCO and SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type).

PISCO

In this example, we are interested in whether ‘holding back’ Y8 pupils will have a beneficial impact on their learning outcomes.

  • Pupil or Problem - Who? Y8 pupils who have made insufficient progress

  • Intervention - What or how? Pupils will not progress to Y9 and will remain in Y8 and be provided with additional support

  • Setting - Where? A secondary school in an inner city

  • Comparison - Compared to what? Progression to Y9

  • Outcome - Aim or objective(s)? For pupils to have caught up with pupils who progressed to Y9

SPIDER

For this example, we are interested in the following question: ‘What are Y7 pupils’ experiences of transition from primary school to secondary school?’

  • Sample - Who? Y7 pupils

  • Phenomenon of Interest - What’s taking place or happening? Pupils’ transition from Y6 to Y7

  • Design - Study design? Interviews, focus groups and surveys

  • Evaluation - Outcome measures? Perceptions of support, expectations and attitudes towards school

  • Research type - Qualitative

What are the benefits of developing structured questions?

As a busy teacher, you may ask yourself whether it’s worth taking the time and effort to develop structured and well-formulated questions. Unfortunately, there is little or no research which supports the claim that such an approach benefits teachers – nor, for that matter, that disciplined inquiry is an effective component of performance management. However, within the context of medicine and health care, seven potential benefits of the question formulation process have been identified (Straus et al. 2011), and these are likely to transfer to the setting of a school. These benefits include:

• Focusing your scarce professional learning time on evidence that is directly relevant to the needs of your pupils

• Concentrating professional learning time on searching for evidence that directly addresses your own requirements for enhanced professional knowledge.

• Developing time-effective search strategies to help you access multiple sources of relevant and useful evidence.

• Suggesting the forms that useful answers might take.

• Helping you communicate more clearly when requesting support and guidance from colleagues

• Supporting your colleagues in their own professional learning, by helping them ask better questions

• Increasing your level of job satisfaction, by asking well-formulated questions which are then answered.

Tips for developing your question

There is no one preferred way of developing the question which forms the basis of your disciplined inquiry. However, there are a number of actions you can take which will increase the likelihood of developing a question that may lead to improvement in both your teaching and outcomes for pupils.

• Seek help from colleagues. If your school has a school research lead, get their advice; they may help you refine your question or point you in the direction of colleagues who have looked into the same or a similar question.

• Developing your question is an iterative process and your question will change as you discuss issues with colleagues, begin to explore the literature and your own thinking changes.

• Don’t be afraid to write down your question, even if, in your mind, it is incomplete or not yet fully formed. Keep a written record of your thinking as it develops.

• When developing a PICO or similar type of question, particularly if you are a new teacher, you may find it difficult to identify both the intervention and the comparator. At this stage you may want to focus on both the problem being encountered and the outcomes which you wish to bring about.

• When thinking about the comparator you might want to spend some time working on how you would describe ‘business as usual’ – as this is likely to be the comparator to whatever intervention is being considered.

• In all likelihood, for any problem you are trying to address, there will be more than one question you could ask. It will be useful to focus on a single question, when considering how to access different sources of evidence.

• Before committing any time and effort to trying to answer your well-formulated question, think long and hard about whether the benefits of answering it will outweigh the costs. Is your question feasible, interesting, novel, ethical and relevant (Hulley et al. 2013)?

References

Hulley, Stephen B et al. 2013. Designing Clinical Research. Philadelphia: Lippincott Williams & Wilkins.

Straus, S E, P Glasziou, S W Richardson, and B Haynes. 2011. Evidence-Based Medicine: How to Practice and Teach It. (Fourth Edition). Edinburgh: Churchill Livingstone: Elsevier.

We need to talk about RISE and evidence-informed school improvement - is there a crisis in the use of evidence in schools?

Recently published research (Wiggins et al. 2019) suggests that an evidence-informed approach to school improvement – the RISE Project – may lead to pupils making small amounts of additional progress in mathematics and English compared to children in comparison schools. However, these differences are both small and not statistically significant, so the true impact of the project may have been zero. Now, for critics of the use of research evidence in schools, this may indeed be ‘grist to their mill’, the argument being: why should schools commit resources to an approach to school improvement which does not bring about improvements in outcomes for children? So where does that leave the proponents of research use in schools? Well, I’d like to make the following observations, though I need to add that these observations are made with the benefit of hindsight and may not have been obvious at the time.

First, the evidence-informed model of school improvement was new – so we shouldn’t be surprised if new approaches don’t always work perfectly first time. That doesn’t mean we should be blasé about the results and try and downplay them just because they don’t fit in with our view about the potential importance of the role of research evidence in bringing about school improvement. More thinking may need to be done to develop both robust theories of change and theories of action, which will increase the probability of success. Indeed, if we can’t develop these robust theories of change/action – then we may need to think again.

Second, the RISE Model is just one model of using evidence to bring about school improvement, with the Research Lead model being highly reliant on individuals within both Huntington School and the intervention schools. Indeed, the model may have been fatally flawed from the outset, as work in other fields – for example, Kislov, Wilson, and Boaden (2017) – suggests that it is probably unreasonable to expect any one individual to have all the skills necessary to be a successful school research champion, cope with the different types of knowledge, build connections both within and outside of the school, and at the same time maintain their credibility with diverse audiences. As such, we need to look at different ways of increasing the collective capacity and capability for using research and other evidence in schools – which may have greater potential to bring about school improvement.

Third, the EEF’s school improvement cycle may in itself be flawed and require further revision. As it stands, the EEF school improvement cycle consists of five steps – decide what you want to achieve; identify possible solutions - with a focus on external evidence; give the idea the best chance of success; did it work; securing and spreading change by mobilising knowledge. However, for me, there are two main problems. First, at the beginning of the cycle there is insufficient emphasis on the mobilisation of existing knowledge within the school, with too much emphasis on external research evidence. The work of Dr Vicky Ward is very useful on how to engage in knowledge mobilisation. Second, having identified possible solutions the next step focusses on implementation, whereas there needs to be a step where all sources of evidence – research evidence, practitioner expertise, stakeholder views and school data – are aggregated and a professional judgment is made on how to proceed.

Fourth, some of the problems encountered – for example, the high levels of staff turnover, with staff involved in a high-profile national project using it as a springboard for promotion – were pretty predictable and should have been planned for at the start of the project.

Fifth, the project was perhaps over-ambitious in its scale – with over 20 schools actively involved in the intervention – and maybe the project would have benefitted from a small efficacy trial before conducting a randomised controlled trial. Indeed, there may need to be a range of efficacy trials looking at a range of different models for evidence-informed school improvement.

Sixth, we need to talk about headteachers and their role in promoting evidence-informed practice in schools. It’s now pretty clear that headteachers have a critical role in supporting the development of evidence-informed practice (Coldwell et al. 2017), and if they are not ‘on-board’ then Research Leads are not going to have the support necessary for their work to be a success. Indeed, the EEF may need to give some thought not just to how schools are recruited to participate in trials but also to the level of commitment of the headteacher to the trial – with a process being used to gauge headteacher commitment to research use in schools.

And finally

The EEF and the writers of the report should be applauded for using the TIDieR framework, which provides a standardised way of reporting on an intervention and is a great example of education learning from other fields and disciplines.

References

Coldwell, Michael et al. 2017. Evidence-Informed Teaching: An Evaluation of Progress in England. Research Report. London, U.K.: Department for Education.

Kislov, Roman, Paul Wilson, and Ruth Boaden. 2017. “The ‘Dark Side’ of Knowledge Brokering.” Journal of Health Services Research & Policy 22(2): 107–12.

Wiggins, M et al. 2019. The RISE Project: Evidence-Informed School Improvement: Evaluation Report. London.