If government is serious about digital transformation and AI, it needs a fundamental shift in mindset

Evaluation should be hardwired into the design of any digital project, and tight budgets should strengthen the argument for finding out what works

By Patrick King

09 Oct 2024

The new government has been bullish on the potential of AI to transform public services, squaring the circle of ever-growing demand and ever-tightening departmental budgets.

Darren Jones, the new chief secretary to the Treasury, previously called for a government which lets “1,000 AI pilots” flourish, and £32m of funding has already been committed to nearly 100 AI projects intended to “boost productivity” and “improve public services”. On average, that is less than £350,000 per project, meaning cash will be spread very thin.

But unless the government has robust plans to evaluate which of these projects have been successful, the payoff will be equally minimal. A recent poll of officials found that almost half believe their department does not effectively measure the progress or success of its digital and data initiatives.

With government betting on so many horses, it’s crucial departments take the time to work out which digital projects have been most effective, so the Treasury can invest in scaling them further. Spending on these projects otherwise risks being closer to gambling than investment. As Reform research has previously shown, evaluation in Whitehall is largely an afterthought and left to a set of “proactive amateurs”, rather than seen as crucial to improving outcomes.

Remarkably, of the 108 most complex and significant projects managed by government in 2019, only nine were fully evaluated. Government has no real idea whether billions of pounds of spending are making any difference. And these are only the largest projects; there are many more smaller ones. Tom Adeyoola, the co-author of Labour’s start-up review, told a recent Reform event that government has “more AI pilots than Heathrow”. The risk is that AI pilots become one-off vanity projects for the teams that initiate them, rather than a tool to kick-start reform of services.

Departments need to embed the practice of evaluation in their everyday work, particularly for digital and data projects. To cure "pilotitis" – projects starting and stopping without achieving lasting change – we must start caring about evaluation.

Singapore’s Open Government Products team, for example, maintains a live dashboard of the spending on, and outcomes from, digital services. In Canada, funding from the Treasury is contingent on sign-off from a department’s “head of evaluation” that there is a strong evidence base for a proposed initiative or that there are plans in place to evaluate effectiveness. All government spending in Canada must be periodically evaluated.

By contrast, in the UK, our evaluation protocols are piecemeal and often decided by arbitrary spending cut-offs. The small sums set aside to carry out these evaluations are also inadequate. We need a fundamental shift in mindset. In future, government should consider which pilots have the greatest implications for public sector productivity, and ensure it has plans in place to assess their impact.

Though many digital projects are individually low-cost, in aggregate they represent vast amounts of annual spending, often with the potential to save government multiples of its initial investment. As Gareth Davies, head of the NAO, has said, the technology “already exists to transform service delivery, reduce costs and improve the user experience” – releasing “billions” for other priorities.

If government is serious about digital transformation, and the potential of AI, this should be reflected in evaluation occurring on a regular basis, even for small pilots. Tight budgets should strengthen the argument for finding out what works, not for throwing evaluation on the scrap heap.

Evaluation should be hardwired into the design of any digital project, and spending should be made conditional on departments carrying out and transparently publishing evaluations of their existing pilots. To build digitally enabled public services, we first need a more evidence-led Whitehall.

Patrick King is a senior researcher for the Reform think tank
