Thousands of days of INSET and a deluge of references to effect sizes

Please note – since this blogpost was published I have come across an updated version of Professor Kraft’s paper, which can be found here.

I’ll be posting an update to this blogpost in the coming weeks.

In England there are approximately 24,000 schools – which means that next week will see thousands of INSET/CPD days taking place.  In all likelihood, in a great number of those sessions, someone leading the session will make some reference to effect sizes to compare different educational interventions.  However, as appealing as it might be to use effect sizes to compare interventions, simply comparing effect sizes does not tell you anything useful about the importance of an intervention for you and your school.  So to help you get a better understanding of how to use effect sizes, I’m going to draw upon the work of Kraft (2018), who has devised a range of questions that will help you interpret the effect size associated with a particular intervention.  But as it is the start of the academic year, it might be useful to first revisit how an effect size is calculated and some existing benchmarks for interpreting effect sizes.

Calculating an effect size

In simple terms, an effect size is a ‘way of quantifying the difference between two groups’ (Coe, 2017, p. 339) and can be calculated using the following formula:

Effect size = (mean of experimental group – mean of control group) / pooled standard deviation
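If it helps to see the arithmetic written out, here is a minimal sketch in Python of that formula. The function names, and the assumption that you know each group’s mean, standard deviation and sample size, are mine rather than Coe’s:

```python
import math

def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two groups of sizes n1 and n2."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def effect_size(mean_exp: float, mean_ctrl: float, pooled: float) -> float:
    """Standardised mean difference: (experimental mean - control mean) / pooled SD."""
    return (mean_exp - mean_ctrl) / pooled
```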

To illustrate how an effect size is calculated, Coe makes reference to the work of Dowson (2002), who attempted to demonstrate time-of-day effects on children’s learning – in other words, do children learn better in the morning or the afternoon?

  • Thirty-eight children took part in the intervention, with half being randomly allocated to listen to a story and respond to questions at 9.00 am, whereas the remaining 19 students listened to the same story and answered the same questions at 3.00 pm.

  • The children’s understanding of the story was assessed using a test in which the number of correct answers was scored out of twenty.

  • The morning group had an average score of 15.2, whereas the afternoon group had an average score of 17.9 – a difference of 2.7.

  • With a pooled standard deviation of 3.3, the effect size of the intervention can now be calculated: (17.9 – 15.2)/3.3 ≈ 0.8 SD.
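The same arithmetic, written out as a quick check (the pooled standard deviation of 3.3 is taken from Coe’s worked example rather than calculated from the raw scores):

```python
# Dowson (2002) worked example: afternoon group vs morning group
effect = (17.9 - 15.2) / 3.3
print(round(effect, 2))  # 0.82 - roughly the 0.8 SD reported
```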

Benchmarks for interpreting effect sizes

A major challenge when interpreting effect sizes is that there is no generally agreed scale. One of the most widely used sets of benchmarks comes from the work of Jacob Cohen (Cohen, 1992), who defines small, medium and large effect sizes as 0.2, 0.5 and 0.8 SD respectively.  However, this scale was derived to identify the sample size required to give yourself a reasonable chance of detecting an effect of that size, if it existed.  As Cohen (1988) notes: ‘The terms “small,” “medium,” and “large” are relative not only to each other, but to the area of behavioural science, or even more particularly to the specific content and research method being employed in any given investigation’ (p. 25).  As such, Cohen’s benchmarks should not be used to interpret the magnitude of an effect size.

Alternatively, you could use the ‘hinge-point’ of 0.4 SD put forward by John Hattie (Hattie, 2008), who reviewed over 800 meta-analyses and argues that the average effect size of all possible educational influences is 0.4 SD.  Unfortunately, as Kraft (2018) notes, Hattie’s meta-analysis includes studies with small samples, weak research designs and proximal measures – all of which result in larger effect sizes.  As such, the 0.4 SD hinge-point is in all likelihood an overestimate of the average effect size. Indeed, Lipsey et al. (2012) argue that, based on empirical distributions of effect sizes from comparable studies, an effect size of 0.25 SD in education research should be interpreted as large.  Elsewhere, Cheung and Slavin (2015) found that average effect sizes for interventions ranged from 0.11 to 0.32 SD depending upon sample size and the comparison group.

You could also look at the work of Higgins et al. (2013) and the Education Endowment Foundation’s Teaching and Learning Toolkit, which suggests that low, moderate, high and very high effect sizes are -0.01 to 0.18, 0.19 to 0.44, 0.45 to 0.69, and 0.70 SD and above respectively.  However, it’s important to note that this is based on the assumption that the effect size of a year’s worth of learning at elementary school is 1 SD – yet Bloom et al. (2008) found that for six-year-olds a year’s worth of growth is approximately 1.5 standard deviations, whereas for twelve-year-olds a year’s worth of growth was 0.2 standard deviations.
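To see why that assumption matters, here is a small, purely illustrative calculation translating the same effect size into ‘years of learning’ under the different annual-growth figures quoted above (the 0.4 SD effect size is just an example of my own):

```python
# Illustrative only: converting an effect size into "years of learning"
# under different assumptions about how much pupils grow in one year.
effect = 0.4  # an example effect size, in SD units

annual_growth_sd = {
    "EEF Toolkit assumption": 1.0,           # 1 SD per school year
    "six-year-olds (Bloom et al.)": 1.5,
    "twelve-year-olds (Bloom et al.)": 0.2,
}

for scenario, growth in annual_growth_sd.items():
    print(f"{scenario}: {effect / growth:.2f} years of learning")
# 0.40, 0.27 and 2.00 years respectively - the same effect size looks very
# different depending on the assumed rate of annual growth.
```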

Finally, you could refer to the work of Kraft (2018), who undertook an analysis of 481 effect sizes from 242 RCTs of education interventions with achievement outcomes and came up with the following effect size benchmarks for school pupils: less than 0.05 SD is small; 0.05 to less than 0.20 SD is medium; and 0.20 SD or greater is large. Nevertheless, as Kraft himself notes, ‘these are subjective but not arbitrary benchmarks.  They are easy heuristics to remember that reflect the findings of recent meta-analyses’ (p. 18).
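To make the disagreement between these scales concrete, the following sketch labels the same effect size against Cohen’s benchmarks, the EEF Toolkit bands and Kraft’s benchmarks. The thresholds are those quoted above; the little helper function is simply my illustration, not something proposed by any of these authors:

```python
def label(d, bands):
    """Return the first label whose lower bound the effect size d meets.
    Bands are listed from largest to smallest; anything below every bound
    falls into the last (smallest) band."""
    for lower, name in bands:
        if d >= lower:
            return name
    return bands[-1][1]

COHEN = [(0.8, "large"), (0.5, "medium"), (0.2, "small"), (0.0, "below small")]
EEF   = [(0.70, "very high"), (0.45, "high"), (0.19, "moderate"), (-0.01, "low")]
KRAFT = [(0.20, "large"), (0.05, "medium"), (0.0, "small")]

d = 0.25
for scale, bands in [("Cohen", COHEN), ("EEF Toolkit", EEF), ("Kraft", KRAFT)]:
    print(f"{scale}: {label(d, bands)}")
# Cohen: small | EEF Toolkit: moderate | Kraft: large
```

The same 0.25 SD effect comes out as ‘small’, ‘moderate’ and ‘large’ depending on which scale you pick – which is precisely why benchmarks alone will not settle the question.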

Kraft’s Guidelines for interpreting effect sizes

The above would suggest that attempting to interpret effect sizes by the use of standardised benchmarks is not an easy task – different scales suggest that large effect sizes range from 0.2 to 0.8 SD.  As such, if we go back to the 0.8 SD effect size Dowson found when looking at time-of-day effects on pupils, does this mean we have found an intervention with a large effect size, which you and your school should look to implement?  Unfortunately, as much as you might like to think so, it’s not that straightforward. Effect sizes are not just determined by the effectiveness of the intervention but by a range of other factors – see Simpson (2017) for a detailed discussion.  Fortunately, Kraft (2018) has identified a number of questions that you can ask to help you interpret effect sizes, and in Table 1 we will now apply these questions to Dowson’s findings.

[Table 1: Kraft’s (2018) questions for interpreting effect sizes, applied to Dowson’s (2002) findings]

As such, given the nature of the intervention – and in particular the relatively short period of time between the intervention and the measurement of the outcomes, and the fact that the outcomes were closely aligned to the intervention – we should not be overly surprised to find an effect size which might at first blush be interpreted as large.

Implications for teachers, school research leads and school leaders

One, it is necessary to be extremely careful to avoid simplistic interpretations of effect sizes. In particular, where you see Cohen’s benchmarks being used, this should set off alarm bells about the quality of the work you are reading.

Two, when interpreting the effect size of an intervention – particularly in single studies where the effect size is greater than 0.20 SD – it’s worth spending a little time applying Kraft’s set of questions, to see if there are any factors contributing to upward pressure on the resulting effect size.

Three, when making judgments about an intervention – and whether it should be introduced into your school – the effect size is only one piece of the jigsaw.  Even if an intervention has a relatively small effect size, it may still be worth implementing if the costs are relatively small, the benefits are quickly realised, and it does not require a substantial change in teachers’ behaviour.

Last but not least, no matter how large the effect size of an intervention, what matters are the problems that you face in your classroom, department or school.  Large effect sizes for interventions that will not solve a problem you are faced with are, for you, largely irrelevant.

References

Bloom, H. S. et al. (2008) ‘Performance trajectories and performance gaps as achievement effect-size benchmarks for educational interventions’, Journal of Research on Educational Effectiveness, 1(4), pp. 289–328.

Cheung, A. and Slavin, R. E. (2015) ‘How methodological features affect effect sizes in education’, Best Evidence Encyclopedia, Johns Hopkins University, Baltimore, MD. Available online at: http://www.bestevidence.org/word/methodological_Sept_21_2015.pdf (accessed 18 February 2016).

Coe, R. (2017) ‘Effect size’, in Coe, R. et al. (eds) Research Methods and Methodologies in Education (2nd edition). London: SAGE.

Cohen, J. (1988) Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.

Cohen, J. (1992) ‘A power primer’, Psychological bulletin, 112(1), p. 155.

Dowson, V. (2002) Time of day effects in schoolchildren’s immediate and delayed recall of meaningful material. TERSE Report. CEM, University of Durham. Available at: http://www.cem.dur.ac.uk/ebeuk/research/terse/library.htm.

Hattie, J. (2008) Visible learning: A synthesis of over 800 meta-analyses relating to achievement. London: Routledge.

Higgins, S. et al. (2013) The Sutton Trust-Education Endowment Foundation Teaching and Learning Toolkit Manual. London: Education Endowment Foundation.

Kraft, M. A. (2018) Interpreting effect sizes of education interventions. Brown University Working Paper. Downloaded Tuesday, April 16, 2019, from ….

Lipsey, M. W. et al. (2012) ‘Translating the statistical representation of the effects of education interventions into more readily interpretable forms’, National Center for Special Education Research. ERIC.

Simpson, A. (2017) ‘The misdirection of public policy: comparing and combining standardised effect sizes’, Journal of Education Policy. Routledge, pp. 1–17. doi: 10.1080/02680939.2017.1280183.

Useful sources of information for teachers, school research leads and senior leaders

No doubt as the summer holidays draw to a close and the new term approaches, there will be teachers, school research leads and senior leaders who will be preparing to deliver a start of term INSET/CPD session, which might have as a focus evidence-informed practice. So to help those colleagues with their preparation for such a session, I thought it might be useful to share a range of resources – books, blogs, resources available online and institutional websites – which colleagues might find useful in their preparations. It’s not an exhaustive list of resources; on the other hand, it might point you in the direction of something which helps you deliver a session which has real value for colleagues. So here goes:

Books

Ashman, G. (2018) The Truth about Teaching: An evidence-informed guide for new teachers. London: SAGE.

Barends, E. and Rousseau, D. M. (2018) Evidence-based management: How to use evidence to make better organizational decisions. London: Kogan-Page.

Brown, C. (2015) ‘Leading the use of research & evidence in schools’. London: IOE Press.

Cain, T. (2019) Becoming a Research-Informed School: Why? What? How? London: Routledge.

Didau, D. (2015) What if everything you knew about education was wrong? Crown House Publishing.

Hattie, J. and Zierer, K. (2019) Visible Learning Insights. London: Routledge.

Higgins, S. (2018) Improving Learning: Meta-analysis of Intervention Research in Education. Cambridge: Cambridge University Press.

Kvernbekk, T. (2015) Evidence-Based Practice in Education: Functions of Evidence and Causal Presuppositions. London: Routledge.

Netolicky, D. (2019) Transformational Professional Learning: Making a Difference in Schools. London: Routledge.

Petty, G. (2009) Evidence-based teaching: A practical approach. Nelson Thornes.

Weston, D. and Clay, B. (2018) Unleashing Great Teaching: The Secrets to the Most Effective Teacher Development. Routledge.

Wiliam, D. (2016) Leadership for teacher learning. West Palm Beach: Learning Sciences International.

Willingham, D. (2012) When can you trust the experts: How to tell good science from bad in education. San Francisco: John Wiley & Sons.

Blogs

Rebecca Allen https://rebeccaallen.co.uk

Christian Bokhove https://bokhove.net

Larry Cuban https://larrycuban.wordpress.com

Centre for Evaluation and Monitoring http://www.cem.org/blog/

Harry Fletcher-Wood https://improvingteaching.co.uk

Blake Harvard https://theeffortfuleducator.com

Ollie Lovell http://www.ollielovell.com/about/

Alex Quigley https://www.theconfidentteacher.com/category/evidence-in-education/

Tom Sherrington https://teacherhead.com

Robert Slavin https://robertslavinsblog.wordpress.com

Other resources

Barwick, M. (2018) The Implementation Game Worksheet. Toronto, ON: The Hospital for Sick Children.

CEBE (2017) ‘Leading Research Engagement in Education : Guidance for organisational change’. Coalition for Evidence-Based Education.

CESE (2014) ‘What Works Best: Evidence-based practice to help NSW student performance’. Sydney, NSW: Centre for Education Statistics and Evaluation.

CESE (2017) ‘Cognitive Load Theory: Research that teachers need to understand’. Sydney, NSW: Centre for Education Statistics and Evaluation

Coe, R. and Kime, S. (2019) ‘A (New) Manifesto for Evidence-Based Education: Twenty Years On’. Sunderland, UK: Evidence-Based Education.

Coe, R. et al. (2014) What makes great teaching? Review of the underpinning research. London: Sutton Trust.

Deans for Impact (2015) ‘The Science of Learning’. Austin, TX: Deans for Impact.

Dunlosky, J. (2013) ‘Strengthening the student toolbox: Study strategies to boost learning’, American Educator, 37(3), pp. 12–21.

IfEE (2019) ‘Engaging with Evidence’. York: Institute for Effective Education. 

Metz, A. & Louison, L. (2019) The Hexagon Tool: Exploring Context. Chapel Hill, NC: National Implementation Research Network, Frank Porter Graham Child Development Institute, University of North Carolina at Chapel Hill. Based on Kiser, Zabel, Zachik, & Smith (2007) and Blase, Kiser & Van Dyke (2013).

Nelson, J. and Campbell, C. (2017) ‘Evidence-informed practice in education: meanings and applications’, Educational Research, 59(2), pp. 127–135.

Rosenshine, B. (2012) ‘Principles of Instruction: Research-based principles that all teachers should know’, American Educator, Spring 2012.

Stoll, L. et al. (2018) ‘Evidence-Informed Teaching: Self-assessment tool for teachers’. London, UK: Chartered College of Teaching.

Useful websites

Best Evidence in Brief Fortnightly newsletter which summarises some of the most recent educational research https://www.beib.org.uk 

Best Evidence Encyclopaedia A website created by the Johns Hopkins University School of Education. It provides summaries of scientific reviews and is designed to give educators and researchers fair and useful information about the evidence supporting a variety of teaching approaches for school students www.bestevidence.org.uk

Campbell Collaboration The Campbell Collaboration promotes positive social and economic change through the production and use of systematic reviews and other evidence synthesis for evidence-based policy and practice – 38 of which have been produced for education https://campbellcollaboration.org/

Chartered College of Teaching The professional association for teachers in England – provides a range of resources for teachers interested in research-use https://chartered.college/ 

Deans for Impact A group of senior US teacher educators who are committed to the use research in teacher preparation and training https://deansforimpact.org/

Education Endowment Foundation Guidance Reports Provides a range of evidence-based recommendations for how teachers can address a number of high priority issues https://educationendowmentfoundation.org.uk/tools/guidance-reports/

Education Endowment Foundation Teaching and Learning Toolkit  A summary of the international evidence on teaching and learning for 5 -16-year olds https://educationendowmentfoundation.org.uk/evidence-summaries/teaching-learning-toolkit/

EPPI-Centre Based at the Institute of Education, University College London, the EPPI-Centre is a specialist centre for the development and conduct of systematic reviews in social science https://eppi.ioe.ac.uk/cms/

Evidence for Impact Provides teachers and school leaders with accessible information on which educational interventions have been shown to be effective. www.evidence4impact.org.uk

Institute for Education Sciences  The Institute of Education Sciences (IES) is the statistics, research, and evaluation arm of the U.S. Department of Education, whose role is to provide scientific evidence on which to ground education practice and policy and to share this information in formats that are useful and accessible  https://ies.ed.gov/

Research Schools Network A group of 32 schools in England – supported by the Education Endowment Foundation and the Institute for Effective Education – who support the use of evidence to improve teaching practice. https://researchschool.org.uk/

Teacher Development Trust   Provides access to resources for teachers interested in research use and continuous professional development https://tdtrust.org/ 

The Learning Scientists A US-based group of cognitive psychological scientists interested in the science of learning, who want to make scientific research on learning more accessible to students, teachers, and other educators. http://www.learningscientists.org/

What Works Clearinghouse Part of the IES, the What Works Clearinghouse reviews educational research to determine which studies meet rigorous standards and summarises the findings, so as to answer the question of ‘what works in education’. https://ies.ed.gov/ncee/wwc/

And remember

Just because a writer, text or organisation appears on the above lists, you still need to critically engage with what is said or written. You still need to ask: What is it? Where did I find it? Who has written/said this? When was this written/said? Why has this been written/said? How do I know if it is of good quality? (Aveyard, Sharp and Woolliams, 2011)

Reference

Aveyard, H., Sharp, P. and Woolliams, M. (2011) A beginner’s guide to critical thinking and writing in health and social care. Maidenhead, Berkshire: McGraw-Hill Education (UK).