Amid growing concerns about the impact of the Covid pandemic on school attendance, the Department for Education turned to data as a way to help it improve the situation.
Before the pandemic, the department had only published termly datasets, with a time lag of around six months, making it difficult to address any immediate issues and identify emerging patterns. During the pandemic, the government created the Educational Setting Status form, which schools used to provide information about attendance. DfE said this gave it valuable insights but also created a burden for schools. Last year the government decided to develop a new system of automated, twice-daily data collection and tasked a small team with speedily turning that data into an accessible, fortnightly picture of attendance across the country.
This meant the team needed to work on around 20 new data publications per year. This is on top of termly census absence figures, which are still being produced as only 85% of schools are signed up to the automated twice-daily stats collection system.
The school census stats team delivered this new way of working in less than six months, in time for the start of the school year in September 2022. It has published fortnightly ever since, allowing policymakers to identify trends more easily and respond quickly to issues that arise.
In recognition of this achievement, and the quality of the publications, the team received the Royal Statistical Society’s Campion Award for Excellence in Official Statistics, which is presented in partnership with the UK Statistics Authority and CSW.
Judges said the DfE project was an example of agile, useful data provision and an exemplar for others to follow. They also praised the team’s proactive response and their efforts to ensure transparency.
CSW caught up with Sean Gibson, who heads up the school census stats team, and Gemma Selby, who leads on school attendance data, to find out how they did it, what the biggest challenges were and how it felt to win the award.
The team produces a range of publications on topics such as pupil numbers, free school meals, special educational needs and exclusions.
When DfE began collecting the automated data on pupil attendance in spring 2022, it challenged the team to deliver fortnightly publications by the start of the upcoming school year. These publications would summarise attendance across the country and be presented in a way that was accessible and transparent for anyone reading them, which could include schools, local authorities, researchers and many other users.
The team received twice-daily updates on whether pupils in schools that had signed up were present or absent, and, if absent, the reason given. This meant millions of rows of data every day, initially representing around 4.5 million pupils and now around 6 million.
The volume of data was the key challenge, Gibson says. “In one day, one session, you couldn’t open that in Excel for example,” he explains.
“It was huge and we had a relatively fast turnaround time in terms of the data coming in and wanting to publish that and show what the picture was as soon as possible,” he says.
“If something goes wrong with that volume of data – if the server times out or we lose connection or if we realise we need to change something – we’re often dealing with quite long periods of time waiting for the data to run.”
Gibson says the team had to think ahead and optimise their working to avoid losing a significant amount of time waiting for the data to be processed.
Selby led the project, working with team leader Gibson to come up with solutions on coding, dashboard design and quality assurance.
But the “double act”, as Gibson describes himself and Selby, also had a lot of help from across DfE.
“We’re the stats publishers, we’re the ones that do the public-facing stats,” he says. “[But] to get the data and in the shape that we could use it has been a heck of an effort across data colleagues. We’ve had data modellers, data architects, data engineers – data roles that I’m not sure we knew existed three or four years ago.”
The team also had help from project managers and policy colleagues, Selby explains. “On the dashboard optimisation side, within our division we’ve got a really helpful stats development team that have produced loads of great templates to work from,” she says. “They’ve been super helpful in making sure that we can retain stability and performance even when we’ve started getting up to a full year’s worth of data by different breakdowns.”
The publication team has remained a small, close-knit unit but Gibson says big increases in data staff across the department have made the jump up to fortnightly publications possible.
“There’s been a huge undertaking within the department on the data side, much greater resource there,” he says. “I think if it was coming down to just a couple of people trying to do this, we’d still be trying to deal with the first few extracts that came in, trying to understand how to model them and engineer them. The expertise that’s gone into it is what has made it possible.”
Selby adds: “We’ve had so much help across the department. The co-ordinated effort helped us get off to a really quick start, turning around figures in September when we needed to.”
Campion Award judges praised the team’s “proactive response to the need for better data on school attendance, and related safeguarding issues, which gained prominence during the coronavirus pandemic”.
Selby says the team’s longstanding experience publishing absence data based on the school census also helped.
“We already had a really well established methodology,” she says. “So we stuck quite closely to that methodology which made trying to build all our code for the new publication quite easy.”
Selby says building on what they knew had worked previously enabled the team to provide consistency in the presentation of their publications – contributing to what the award judges described as praiseworthy “efforts to ensure transparency so the findings could be communicated to a broad audience”.
“Having already had the census absence publications, we’ve got a really good back series for that and have had some really good user feedback and users are familiar with all of that,” she says.
“We’ve stuck to DfE templates and tried to retain consistency across other templates within our division and made sure that we’re in keeping with other products that are built across the wider department.
“Where we’ve had to deviate from what we’ve done previously in the census publication methodology, we’ve tried to be as clear as possible with that. And with it being a slightly new way of working, with this more timely data, there’s been a couple of data quality issues that have arisen and [so we’ve been] keeping users abreast of that as we’re finding it, making sure that it’s clearly outlined in all our publication methodology.”
But the team says the timeliness of the publications was the biggest achievement.
“Previously, you’d wait till the end of the term, then the next term the data would be collected and it would take us a couple of months to turn that data around and put it into a publication,” Gibson says.
“Whereas now we’ll be publishing data from a week and a half ago. I think that’s the big win and probably what impressed [the judges] most is having a robust system that can do that quickly.”
“It’s intensive,” he adds. “Sometimes it doesn’t feel like you’re getting a break at all. You publish it and then you’re thinking about the next one straight away. It’s a new way of working for us.”
The judges also called the team “an exemplar for others to follow”. What advice would they give to other officials working on similar projects?
“My advice would be to be flexible,” says Gibson, explaining that quirks in the data that “previously you got away with” in a termly release are now more noticeable. “You end up seeing unusual things in the data and you have to react and be flexible around those things,” he says.
“No two local authorities seem to have had the same term dates all year [as to] when they started, when they went off for Christmas, when there were inset days. Trying to deal with that in the publication and still display a representative figure means, by and large, we’ve got a methodology that we can follow, but we’ve needed to flex on that a little bit and to not get too hung up on what we’ve done before.”
Selby adds: “One of the things that it’s taught me, even though we’ve been working from an established methodology, is you can still think fresh about how you’re doing things.
“With the new timeliness of the data, we’ve been able to interrogate things much more closely. We’ve been able to see patterns that we wouldn’t have been able to see otherwise.
“We’ve been able to see stuff like pupils being on religious leave for Eid, whereas previously in the census data it would have been masked out. So you’re able to get a lot better insight if you’re able to follow what the data is showing you rather than following all of your previous methods.”
Winning the Campion Award – which is named after Sir Harry Campion, the first director of the Central Statistical Office, the forerunner to the Office for National Statistics – “still hasn’t quite sunk in”, Selby says.
The project also won DfE the accolade of data transformation project of the year at the British Data Awards. But Gibson says the Campion Award “means the most” to his team as it is “more about the published stats and the transparency and getting the data out there, which is our big part of it”.
“Hearing what other people had done at that awards was like, ‘Oh my God, that’s amazing’,” he says. “And then you go and stand next to them on the stage with yours and it’s like, ‘Yeah, this is pretty cool’.
“You can start to celebrate it properly and acknowledge what we’ve done. Other stats colleagues who’ve popped up and said, ‘It’s a big deal’ makes you realise [what you’ve achieved].”
Getting feedback from those who have used the publications has also been gratifying, Selby says. “A lot of people in the sector have said, ‘This is unbelievable, we’re able to look at our attendance’. You walk up to someone at the school gates and they can say, ‘I’ve been looking at your tool recently, this is really cool’.”
This is one of the best things about working at DfE, Gibson adds. “I’ve been in education for the last six years and I’m not planning on going anywhere else. It’s a good department for seeing impact. You get to see a lot of what you’re doing on the ground.”
The problem-solving aspect of the job is also very rewarding, Selby says. “For people in my family who don’t really understand what I do in terms of stats, I’ll say I’m trying to solve a puzzle in French all day.
“That’s how it feels to try and code things sometimes. But I find it incredibly satisfying when you actually get something that works and something that’s successful. So building the dashboard was a hugely satisfying thing for me to do. Finding out that people like it has been the icing on the cake.”