
Government is buying AI to automate the casework. The decisions that define public value are not casework.

AI in the public sector is procured as a digital programme: automate the transactions, speed up the queue, lift productivity. The harder question is what happens to policy judgement, accountability and public trust when those get a confident, generic answer.

In short

Most public bodies meet AI as an IT and procurement decision: a service-automation tool to clear casework, shorten queues and recover productivity. That work matters, and the evidence shows it is where most government AI effort is going. The trouble is that the decisions that actually define public value are not transactions: the policy choice, the accountability position, where human judgement must stay, the call on public trust. Hand one of those to a commodity AI and it returns a fluent answer that was never anchored in the public it serves. In government a wrong, unchallenged AI-shaped decision is not a margin hit. It is a legitimacy failure. Gildoni installs the Havruta Methodology (formerly the Think Partner Methodology) into how public-sector leaders reason with AI, so they reason with it rather than defer to it. This is not a govtech platform. It is the judgement underneath the rollout.

01 · The pressure

AI arrives in government as a productivity programme

The pressure is real and it is everywhere. Budgets are tight, demand is rising, and AI is the obvious lever. So it gets scoped the way digital programmes are always scoped: a tool to automate transactions, clear casework and recover capacity, procured and run through the technology function.

The data backs this up. The OECD's review of two hundred government AI use cases found the effort heavily weighted toward doing the existing work faster, not toward sharpening the decisions behind it. And in the UK, the National Audit Office found the same gap between intent and practice: plenty of ambition, far less in production, and a persistent uncertainty about who is accountable when an AI-assisted decision goes wrong.

None of this is misplaced. Automating a backlog is worth doing. The point is narrower and sharper: the place AI enters the public sector is not the place its hardest decisions live.

Three numbers show where the effort goes, and where the trust sits.

57% vs 45%

of government AI cases automate, streamline or tailor services, against 45% that enhance decision-making, and 30% that improve accountability.

OECD, review of 200 government AI use cases

37%/70%

of UK government bodies have deployed AI, while 70% are only piloting or planning it, and 67% cite uncertainty over legal liability as a barrier.

National Audit Office, Use of AI in Government, 2025

72%

of the UK public say laws and regulation would increase their comfort with AI, up from 62% in 2023. Trust, not speed, is the constraint.

Ada Lovelace Institute and Alan Turing Institute, national survey, 2025

The effort is going to the transactions. The trust is sitting with the decisions.

"This new evidence shows that, for AI to be developed and deployed responsibly, it needs to take account of public expectations, concerns and experiences."
Octavia Field Reid, Associate Director, Ada Lovelace Institute

02 · The diagnosis

Where the scoping quietly goes wrong

The misuse is not in automating casework. It is in letting the automation define the boundary of what AI is for.

A transaction has a right answer. A public-value decision has a defensible one. Whether to approve a claim faster is a process question. Whether the policy behind the claim is fair, who carries the accountability when it is wrong, where a human must stay in the loop, how the public will read the change: those are not faster-or-slower questions. They are judgement questions, and they are exactly the ones a commodity AI answers most confidently and least usefully.

The risk is asymmetric. In a commercial setting a generic AI answer costs margin. In the public sector the same generic answer, accepted without challenge, costs legitimacy. A citizen denied an appeal route, a cohort quietly disadvantaged by a model nobody interrogated, a decision no official can fully explain: these do not show up as a productivity line. They show up as a collapse in trust, and the survey evidence is unambiguous that trust is already the binding constraint.

It is worth being honest about the other side. The disciplines of good government, knowing your duty, owning the decision, deciding fairly under uncertainty, are old. AI has not rewritten them. It has stress-tested them, and found a great deal of public-sector reasoning being quietly outsourced to a machine that cannot know the public it serves.

03 · The turn

Automating the casework is the easy part

Here is the part the procurement frame leaves out. Every vendor and digital strategy now says the same thing: deploy AI, automate the queue, recover the capacity. Almost none of them say how a leadership team actually reasons through the policy, accountability and trust decisions that the automation sits inside.

That is the real gap, and it is not a tooling gap. Put a public-value question to a commodity AI and it will hand back a fluent, confident answer without ever asking whose interest it has missed. At the altitude where the decisions are largest, that is the Mirror Principle at its most expensive: if the reasoning going in was generic, the policy position coming out is generic, however polished it reads. A procurement framework can put AI in the building. It cannot make the judgement that goes into an AI-assisted policy decision any good. That is a different discipline.

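[Figure: three levels, from top to bottom. Public value and public trust: where legitimacy is decided. Policy and accountability: where the judgement lives. Service automation and casework: where the AI programme enters. AI is procured to automate at the bottom level, where most bodies stop, while the real decision rises above it. The distance between the two is marked as the trust gap.]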
The gap

AI enters low, procured to automate casework, which is where most public bodies stop. The decisions that define public value, policy and accountability, and above them public trust, sit far higher. The red is the distance between where AI is bought and where legitimacy is decided: automated, but not reasoned.

04 · The discipline

What the Havruta Methodology installs at leadership level

The Havruta Methodology is that discipline. It changes the default behaviour of the machine from agreeing with you to reasoning with you, which is precisely what a public-value decision needs.

Move 01

The Flip

The Flip puts the machine on the other side of the question. Instead of confirming the policy line, it argues against it: who does this disadvantage, what is the appeal route, what would have to be true for this to be unfair. The leadership team gets challenged before a citizen, a court or the press does the challenging.

Move 02

Ground Truth

Ground Truth keeps the reasoning anchored in the public this body actually serves, its real duties, populations and obligations, rather than in the generic policy language an AI produces by default. A public decision built on a plausible average is worse than no AI at all.

Move 03

Decision Velocity

Decision Velocity lets the team decide at the speed demand is rising, compressing the path from question to a defensible, explainable position without surrendering the judgement, or the accountability, to the machine.

The fuller account of how all of this works is on the methodology page.

05 · The boundary

What this is not

This is not a govtech platform and it is not a service-automation tool. It is not robotic process automation, not a case-management system, not AI-governance software, and it is not AI training or general AI literacy. The platforms and the frameworks are a separate market. This is the thinking underneath them.


It changes how the leadership team reasons about the public-value decisions it already owns: the policy choice, the accountability position, the call on where human judgement must stay, and the decision the public will have to trust.

06 · Where to begin

Where a leadership team starts

The methodology is installed along a ladder, and a leadership team enters at the rung that fits.

01

Most begin with the Eye-Opener Workshop, a half-day in which the team sees the shift on its own real work.

02

A leadership group embeds the practice through the Havruta programme, taking the discipline across the team.

03

A single high-stakes question, a policy position, an accountability model, a decision the public will have to trust, can be worked through Advisory Havruta.

The next altitude down

How a senior leader reasons with AI

For the leader who owns the public-value decision day to day, the role page takes the same discipline to the seat that carries the accountability. A Strategic Briefing is how to decide where to begin.

Frequently asked questions

Leadership questions about AI in the public sector

Is AI in government a procurement decision or a leadership one?

It is bought and run as a procurement and digital programme, and most of the effort goes there: automating casework, clearing queues, recovering capacity. But the decisions AI then touches, the policy choice, who is accountable, where human judgement must stay, and whether the public will trust the result, are leadership decisions, not technical ones. Treating those as a by-product of a tool rollout is the mistake the evidence keeps surfacing. The procurement puts AI in the building. The judgement is a separate discipline.

Why is automating casework not enough?

Automating a backlog is worth doing, and the OECD's review shows it is where most government AI effort goes. But a transaction has a right answer and a public-value decision has a defensible one. Whether to process a claim faster is a process question. Whether the policy is fair, who answers when it is wrong, and how the public will read the change are judgement questions. AI answers those most confidently and least usefully, so leaving them to the automation frame is where the value, and the risk, quietly concentrate.

What is the real risk of a wrong AI-shaped decision in the public sector?

The risk is asymmetric. In a commercial setting a generic, unchallenged AI answer costs margin. In government the same answer costs legitimacy. A citizen denied an appeal route, a cohort disadvantaged by a model nobody interrogated, a decision no official can explain: these do not appear as a productivity line, they appear as a collapse in public trust. The survey evidence is clear that trust, not speed, is now the binding constraint, which is why the unexamined decision is the expensive one.

Is this AI-governance software or a govtech platform?

No. It is not a govtech platform, not robotic process automation, not a case-management system, not AI-governance software, and it does not touch your stack. Those address the framework, the controls and the throughput. This addresses the thinking: how a leadership team reasons through an AI-assisted policy or accountability decision so the answer is genuinely theirs, anchored in the public they serve, and stress-tested before a citizen or a court does it for them.

How should public-sector leaders reason with AI well?

Start by separating the transaction from the decision. Automate the casework, but treat the policy, accountability and trust calls as judgement, not output. Then install the reasoning discipline: make the AI argue the case against your position rather than confirm it, anchor it in the real public you serve rather than generic policy language, and decide at the speed demand is rising without handing the accountability to the machine. The frameworks tell you what to govern. This is how you reason while you do.

Where should we start?

With a Strategic Briefing, or with the Eye-Opener Workshop, where a leadership team sees the difference between instructing AI and reasoning with it on its own real work. From there the path depends on whether you are setting the policy and accountability position, embedding the practice across a leadership group, or working a single high-stakes decision the public will have to trust.

The casework can be automated. We install the reasoning underneath the public-value decision.