It can be very hard to weigh up the evidence on educational topics or on specific interventions.
After all, is it really a choice between a lesson series that seems effective but that no teacher would use, and a lovely intervention package without evidence? In these cases, teachers will make a judgement based on their knowledge and experience of their craft. And that is all well and good.
For me it does become an issue, though, when people want to signal how “evidence-informed” they are but apply different standards to the evidence they select. For example, they might criticise a particular study for its limitations (which every social science study under the sun has) while being far more forgiving of studies whose outcomes they like.
Take the issue of control groups. Let’s imagine that you have a phonics program that you want to give to a group of children, the “intervention group”. If this group improves its literacy scores, we want to know whether that improvement would have happened anyway with a different program, or even with no program at all.
A control group does not receive the intervention, so a comparison can be made. Sometimes the control group does nothing at all (a low bar), sometimes it simply receives normal teaching (“business as usual”), and sometimes it is an active control that receives a different kind of task.
However, the groups in a study need to be comparable. If, say, one group was already doing poorly at maths before the study started, it may appear to do poorly at the end of the study too, but that could simply reflect where it began. Equally, a low-scoring group has more room to improve, which can flatter results in the other direction.
In some studies, comparable groups are obtained by randomly assigning students to either the control group or the intervention group. For some, the gold standard is the fully randomised experiment, a randomised controlled trial.
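To make the mechanics concrete, here is a minimal sketch of what random assignment could look like in code, using Python’s standard library. The pupil names and the even 50/50 split are my own illustrative assumptions, not taken from any particular study:

```python
import random

# Hypothetical class roster; the names are purely illustrative.
pupils = ["Amira", "Ben", "Chloe", "Dev", "Ella", "Finn", "Grace", "Hugo"]

random.seed(42)          # fixed seed only so the example is reproducible
random.shuffle(pupils)   # randomise the order before splitting

half = len(pupils) // 2
intervention_group = pupils[:half]  # would receive the phonics program
control_group = pupils[half:]       # business as usual, or an active control

print("Intervention:", intervention_group)
print("Control:     ", control_group)
```

Because chance, rather than anyone’s choice, decides who ends up in which group, any pre-existing differences between pupils should balance out on average, and that is what makes the two groups comparable.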
Losing control
We saw an example of control group shenanigans in a recent study. It was heralded by some as yet more proof of the importance of foundational skills. I want to focus especially on the control group, which, unlike the intervention group, was taught during 2020-21.
We all know how challenging that year was. The control group is described as business as usual but, of course, teaching in the Covid years was anything but usual. It is quite likely that such a group would not do very well, and that almost any program delivered a year later would look like an improvement.
Let me be clear: the study and the materials are no doubt interesting. However, I don’t think we should ignore these obvious issues with the control group. My point is not that all research needs to be perfect, as this is not even possible, nor do we really know what perfect means.
However, if we do consider research, we should apply the same standards to all the research we look at. It is not useful to downplay the limitations of studies you like while using similar limitations to completely dismiss studies you don’t. Especially if you want to call yourself evidence-informed.
Christian Bokhove is a professor in mathematics education at the University of Southampton and a specialist in research methodologies