Ebook

The CFO’s AI readiness report: Part 4

Author: Payhawk Editorial Team

Read time: 15 mins

Published: May 20, 2026

Quick summary

Finance teams that scale AI share three things: they know what AI is permitted to do, they have data their outputs can be tied back to, and they have named someone accountable for the outcome. The reality in the market? Most self-identified AI leaders are missing at least one of these. This final report in our global four-part series turns that finding into a concrete operating plan: a 30/60/90 sequencing guide across four distinct scaling constraints. If your AI program has stalled, this report shows you where it is stuck and what to fix first.

The moment every finance AI program hits
What this series has established
The reframe: two debts, one decision
Your posture dictates your first move
The 30/60/90 sequencing plan
The aha: Finance AI is a legitimacy problem
CFOs: Your biggest takeaway
Next steps

Foreword: Konstantin Dzhengozov, CFO and Co-founder, Payhawk.

“Four reports in, and this is the one I've been looking forward to the most! The first three were about diagnosis. This (the final in the series) is about what you actually do next. If you've followed the series, you already know more about your AI program than most finance leaders. Now let's make it useful.

When we started this research, we expected adoption to be the story… which teams were moving fast, which were falling behind, and why. That's the narrative most AI research tells. But the data had other ideas. Across 1,520 senior finance and business leaders, the teams that were struggling weren't struggling because they hadn't adopted AI. They were struggling because they'd built AI capability without building the operating conditions that make it defensible. Rules that were never written down. Data that couldn't be reconciled.

In most functions, that gap's an inconvenience. In finance, it's a serious problem. Finance teams don't get to say 'we'll sort the governance later.' Approvals, audit trails, and accountability aren't features you add after the fact. They're the preconditions for any of this to hold. The first three reports showed where that tension lives across the market. This one shows what to do about it. If you've read the series, you know which posture your organization sits in. This report gives you the sequence from there. Use it.”

At some point in almost every finance AI program, the pilot stops being the reference point. The business has moved on, expectations have grown, and AI is being asked to do something that actually matters: an approval, a journal entry, a vendor payment. That's when the gap between what the system can do and what the organization can defend starts to show.

The moment every finance AI program hits

Imagine the situation many finance teams in our research described: a financial controller at a 600-person manufacturer whose team has just run a strong AI pilot, flagging invoice exceptions, suggesting GL coding, and cutting review time by 40%. Her CFO loved it. And they got the budget to expand.

Six months later, the rollout stalled. An internal audit asked a question nobody could cleanly answer: What, exactly, is this system permitted to approve? And who is accountable when it gets it wrong?

The team had built capability, but they hadn’t built the layer around the capability that made it defensible. The AI kept running at the edges, where ambiguity was tolerable, which also meant where it was least valuable.

This is the moment most finance AI programs hit. The pilot works. The business gets excited. Then something (an audit question, an escalation nobody owns, an output that will not reconcile) stops the rollout cold.

Tatiana Okhotina, CFO at Token.io, sums it up best:

Everything must be auditable.

The quote is just four words, but the implication for AI is massive: if you can't trace it, you can't scale it.

What this series has established

The first three reports in this series built a clear picture.

Part 1 showed that AI maturity in finance is structurally uneven, shaped by industry, company size, and operating context. The market isn’t "early." It is split. Some teams are already scaling AI into core workflows; a large middle is stuck converting pilots into production; a tail is still experimenting.

Part 2 identified the five conditions that determine whether AI can scale in finance: execution measures, minimum governance rules, skills, budget, and usable data. Among self-identified AI leaders, only 26% have all five strongly in place simultaneously. Skills and budget tend to come first. Minimum rules and data readiness lag behind.

Part 3 showed that leaders themselves split into six distinct operating postures, each with different strengths, different gaps, and a different bottleneck. Execution-led implementers are shipping fast but have no governance layer. Governance-forward scalers have strong rules but weak data foundations. Scaled adopters have the full stack but need to harden it for broader delegation.

CFOs: Only 26% of finance leaders have all five AI scaling conditions in place

This final report answers the question all three reports were building toward: given where you sit, what do you actually do first?

The reframe: two debts, one decision

Here’s the insight that changes how most CFOs see their AI program.

Most finance teams diagnose their AI stall as a tooling problem, a skills problem, or a change-management problem. The data from 1,520 senior respondents says otherwise. The dominant constraint (across every segment, every company size, every level of self-reported maturity) is one of two structural deficits.

Rules debt accumulates when adoption outpaces governance. AI is active. Work is moving. But the organization cannot state what the AI is permitted to do, when it must escalate, who owns the outcome, or what the audit trail shows. Rules debt is invisible during pilots. It surfaces the moment AI touches anything that demands accountability: approvals, policy enforcement, close workflows, and payments.

Data debt accumulates when governance is in place, but the data environment cannot support reliable outputs. Master data is inconsistent across entities. Integration paths break at the edges. AI outputs cannot be reconciled back to the system of record. Teams in this position have often done real governance work and still find that AI cannot move beyond the assistant layer because the foundation underneath it is not solid enough to act on.

Both debts behave like financial debt in one important way: they accumulate behind the scenes and become expensive at the worst possible moment.

What AI leaders have in place, and what's missing:

Skills and tools: 78% strongly in place
Budget committed: 69%
Execution measures: 64%
Data usable: 61%
Minimum rules: 55%

Skills and budget come first. Of the five conditions, rules and data lag the furthest behind, and the data in this report shows they're the two that hold organizations back from scaling once skills, budget, and execution are in place.

Among AI leaders, 32% have strong skills but no minimum rules in place. A further 22% have strong execution measures and still no minimum rules. In both cases, the capability exists. The permission structure around it doesn't.

Konstantin Dzhengozov, CFO and Co-founder at Payhawk, explains:

Rules debt and data debt aren’t abstract problems. They’re the reason AI feels fast in a pilot and slow in production. In finance, you cannot borrow against governance. You have to build it before you need it.

The decision CFOs now face is simple: Which debt are you paying down first? Which workflows to prioritize, which tools to deploy, how fast to expand: all of it follows from that answer.

This is the operating-model insight that most AI maturity frameworks miss. They measure adoption. They track use-case coverage. They benchmark tools and skills. None of that tells a CFO what is actually blocking their specific program. The debt framing does.

Figure 7. AI leaders mapped against governance and data foundations. Six operating postures plotted by share 'strongly agreeing' minimum rules are in place (horizontal axis) and share 'strongly agreeing' data is ready for AI (vertical axis). Quadrants reveal the dominant constraint by posture.

Your posture dictates your first move

The six operating postures identified in Part 3 each carry a predictable constraint.

1. Execution-led implementers (16%) are shipping fast with strong skills, but minimum governance rules are absent. Their constraint is rules debt. Adding use cases on top of a missing control layer increases exposure faster than it increases value. The first move is governance, not more capability.

2. Governance-forward scalers (13.8%) have rules in place and execution discipline, but data readiness is weak. Their constraint is data debt. More policy refinement won't unlock scale. The first move is master data ownership and integration reliability, before pushing AI deeper into operational workflows.

3. Control-first planners (11.6%) have skills, budget, and data in reasonable shape, but execution measures aren't yet deployed. Their constraint is conversion. The ingredients exist. The first move is choosing one workflow and deploying deliberately, rather than continuing to build readiness in the abstract.

4. Scaled adopters (26.9%) have all five conditions strongly in place. Their constraint is industrialization: hardening approval corridors, exception handling, and audit trails so that delegation stays defensible as it expands.

5. Incremental improvers (17.5%) are making real progress but unevenly. Some conditions are strong, others lag. AI is moving forward, but hasn't compounded into a coherent operating model. Their constraint is consistency: identify which condition is weakest and close that gap before expanding the scope further.

6. Agent-first, control-later (14.1%) teams have strong intent but limited operating discipline across multiple conditions. Their path is narrower. Choose one bounded workflow, build the smallest viable operating stack for it, and prove the model before expanding scope. Broad transformation programs from this position tend to produce months of activity without a single workflow actually scaling.

Not sure of your posture or variant? Find the variant you need below to see which debt is costing you the most right now.

The 30/60/90 sequencing plan

Each variant below starts with a one-sentence situation description. If it sounds like where you are, that's your starting point.

Variant 1: Rules-debt first

This is right for you if your situation looks like this: AI is generating value, but your organization cannot defend what it is doing.

Days 1–30: Define the minimum control layer for one live workflow

Choose the workflow where AI is already active and most likely to expand. For that workflow, write down four things, and only four:

What the AI is permitted to do, and at what threshold it must stop and escalate
Who receives escalated exceptions, and how quickly they must respond
What gets logged by default
Who is accountable for the outcome, as a named role rather than a team

This is a one-page operating specification for one workflow. Keep it short enough to enforce and specific enough to audit.
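As a rough illustration, the four items above can be captured in a small data structure that a workflow checks before acting. This is a minimal sketch, not a Payhawk feature or a prescribed format: the class name, field names, threshold, and roles are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowSpec:
    """One-page operating specification for a single AI-assisted workflow."""
    workflow: str
    permitted_actions: tuple[str, ...]  # what the AI may do on its own
    escalation_threshold: float         # amount above which the AI must stop and escalate
    escalation_owner: str               # role that receives escalated exceptions
    escalation_sla_hours: int           # how quickly that role must respond
    logged_fields: tuple[str, ...]      # what gets logged by default
    accountable_role: str               # a named role, not a team


def decide(spec: WorkflowSpec, action: str, amount: float) -> str:
    """Return 'proceed' or 'escalate' for a proposed AI action."""
    if action not in spec.permitted_actions:
        return "escalate"
    if amount > spec.escalation_threshold:
        return "escalate"
    return "proceed"


# Illustrative values only; none of these come from the report.
invoice_spec = WorkflowSpec(
    workflow="invoice_exception_review",
    permitted_actions=("suggest_gl_code", "flag_exception"),
    escalation_threshold=5_000.0,
    escalation_owner="Financial Controller",
    escalation_sla_hours=24,
    logged_fields=("timestamp", "action", "amount", "decision", "resolved_by"),
    accountable_role="Financial Controller",
)

print(decide(invoice_spec, "suggest_gl_code", 1_200.0))  # proceed
print(decide(invoice_spec, "suggest_gl_code", 9_000.0))  # escalate (over threshold)
print(decide(invoice_spec, "approve_payment", 100.0))    # escalate (not permitted)
```

Anything outside the permitted list escalates by default, so the specification fails safe when the AI is asked to do something nobody wrote down.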

Days 31–60: Run the workflow under the new rules and document what breaks

The goal isn’t a perfect run. The goal is to surface the gaps: unclear escalation paths, missing log entries, and ownership questions that nobody can answer cleanly. Every exception that surfaces is data. By day 60, you should be able to reconstruct exactly what the AI did, what it escalated, who resolved it, and what the record shows.
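One way to test that day-60 goal is to replay the log and look for escalations that nobody resolved. The sketch below assumes a simple list of log entries; the entry format and field names are hypothetical, and a real log would live in whatever system runs the workflow.

```python
from collections import Counter

# Hypothetical log entries; field names are illustrative.
log = [
    {"id": 1, "action": "flag_exception", "decision": "proceed", "resolved_by": None},
    {"id": 2, "action": "suggest_gl_code", "decision": "escalate", "resolved_by": "Financial Controller"},
    {"id": 3, "action": "suggest_gl_code", "decision": "escalate", "resolved_by": None},
]


def review_log(entries):
    """Summarize decisions and list escalations that nobody resolved."""
    decisions = Counter(entry["decision"] for entry in entries)
    unresolved = [
        entry["id"]
        for entry in entries
        if entry["decision"] == "escalate" and not entry["resolved_by"]
    ]
    return dict(decisions), unresolved


summary, unresolved_ids = review_log(log)
print(summary)         # {'proceed': 1, 'escalate': 2}
print(unresolved_ids)  # [3]
```

Each ID in the unresolved list is an ownership question to answer before the second workflow starts.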

Days 61–90: Apply the same framework to a second workflow

Don’t build new governance from scratch for the second workflow. Use the same minimum-rules template, adapted for the new context. The goal is a repeatable pattern, not a one-off design. By day 90: two governed workflows, a functional escalation model, and a log structure that can survive external scrutiny.

The measure of success: When an auditor or regulator asks, "What was the AI permitted to do in this workflow, and who owned the outcome?", your team can answer in under two minutes.

Variant 2: Data-debt first

This is right for you if your situation looks like this: Your governance is solid, but your AI outputs are not trusted enough to act on.

Days 1–30: Map the specific data gaps blocking one workflow

Choose the workflow where AI should, in theory, already be scaling, but where outputs aren’t reliable enough to use confidently. Then answer these questions for that workflow only:

Which master data records does this workflow depend on: vendors, GL mappings, cost centers, and/or entity structures?
Where do those records break or diverge across systems or entities?
What would an auditor need to see to verify that an AI output from this workflow is correct?

This is a targeted data audit for a specific workflow. The question is: what gaps prevent AI outputs in this workflow from being reconcilable?
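The second question above, where records break or diverge, can often be answered with a simple comparison of the same master data in two systems. The following is a minimal sketch under stated assumptions: the two dictionaries stand in for exports from an ERP and an accounts payable tool, and every vendor ID, name, and code is made up for illustration.

```python
# Hypothetical vendor master data exported from two systems, keyed by vendor ID.
erp_vendors = {
    "V001": {"name": "Acme GmbH", "gl_code": "6100", "cost_center": "CC-10"},
    "V002": {"name": "Globex Ltd", "gl_code": "6200", "cost_center": "CC-20"},
    "V003": {"name": "Initech SA", "gl_code": "6300", "cost_center": "CC-30"},
}
ap_vendors = {
    "V001": {"name": "Acme GmbH", "gl_code": "6100", "cost_center": "CC-10"},
    "V002": {"name": "Globex Ltd", "gl_code": "6250", "cost_center": "CC-20"},
    "V004": {"name": "Umbrella BV", "gl_code": "6400", "cost_center": "CC-40"},
}


def find_gaps(system_a, system_b):
    """Return IDs missing from each system and the fields that diverge."""
    missing_in_a = sorted(set(system_b) - set(system_a))
    missing_in_b = sorted(set(system_a) - set(system_b))
    diverging = {}
    for key in sorted(set(system_a) & set(system_b)):
        fields = [
            field
            for field in system_a[key]
            if system_a[key][field] != system_b[key].get(field)
        ]
        if fields:
            diverging[key] = fields
    return missing_in_a, missing_in_b, diverging


missing_in_erp, missing_in_ap, diverging = find_gaps(erp_vendors, ap_vendors)
print(missing_in_erp)  # ['V004']
print(missing_in_ap)   # ['V003']
print(diverging)       # {'V002': ['gl_code']}
```

The output is a short, workflow-specific fix list rather than a general data-quality score, which matches the targeted scope of this step.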

Days 31–60: Fix the data foundations for that workflow

Establish a single owner for each master data set that the workflow depends on. Standardize the structures (categories, entity mappings, GL codes) for this workflow specifically. Build or validate the integration path so outputs can be traced back to a system of record.

Resist the instinct to tackle data quality broadly. Broad data programs rarely surface the specific improvements that unblock a live workflow fast enough to matter. Fix what this workflow needs, then expand.

Days 61–90: Validate that outputs are now reconcilable, then scale

The outputs should now be independently verifiable: reconciled to source data, explainable to an auditor, and owned by a named individual. If that holds, you have proof that your governance model can carry AI into production workflows. Apply the same data-fix approach to the next workflow in sequence.

The measure of success: An AI output from this workflow can be traced back to a source of truth in under five minutes, by someone who was not involved in producing it.
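A reconcilability check of this kind might look like the sketch below. It assumes each AI output carries the ID of the source record it was derived from; that field, the ledger layout, and all values are hypothetical.

```python
from decimal import Decimal

# Hypothetical system-of-record entries, keyed by invoice ID.
ledger = {
    "INV-1": {"gl_code": "6100", "amount": Decimal("1200.00")},
    "INV-2": {"gl_code": "6200", "amount": Decimal("845.50")},
}

# Hypothetical AI outputs, each naming the source record it claims to derive from.
ai_outputs = [
    {"source_id": "INV-1", "gl_code": "6100", "amount": Decimal("1200.00")},
    {"source_id": "INV-2", "gl_code": "6250", "amount": Decimal("845.50")},
    {"source_id": "INV-9", "gl_code": "6300", "amount": Decimal("50.00")},
]


def reconcile(outputs, record):
    """Classify each output as 'matched', 'mismatch', or 'untraceable'."""
    results = {}
    for out in outputs:
        source = record.get(out["source_id"])
        if source is None:
            results[out["source_id"]] = "untraceable"
        elif source["gl_code"] == out["gl_code"] and source["amount"] == out["amount"]:
            results[out["source_id"]] = "matched"
        else:
            results[out["source_id"]] = "mismatch"
    return results


status = reconcile(ai_outputs, ledger)
print(status)  # {'INV-1': 'matched', 'INV-2': 'mismatch', 'INV-9': 'untraceable'}
```

Separating "mismatch" from "untraceable" is a deliberate choice here: the first points to a data or model error, the second to a broken integration path.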

Variant 3: Both debts present

This is right for you if your situation looks like this: AI is active, but neither the governance nor the data foundations can support expansion into core workflows.

This is the posture where the instinct to fix everything at once is strongest, and the most expensive to act on. Broad programs from this starting point tend to produce months of policy work and infrastructure investment without a single workflow actually scaling.

Days 1–30: Choose one bounded workflow and define its minimum viable operating stack

The selection criterion: find the workflow where the cost of an error is manageable, and the data is cleanest. Choose the one where you can prove the model fastest, rather than the one that looks most valuable on paper.

For that workflow only, define on a single page the minimum rules (from Variant 1) and the minimum data requirements. What master data, integration paths, and reconciliation logic does this specific workflow need? Nothing more.

Days 31–60: Deploy, run, and document what breaks

Run it. Log everything. Fix what surfaces. The goal is one working example of a governed, data-supported workflow you can defend under audit, explain to your board, and use as the template for what comes next.

Days 61–90: Use what you learned to sequence your next fix

After 60 days of running the first workflow, you will know which constraint was harder than expected. Was it the rules that were difficult to enforce? Or the data that was more fragmented than anticipated? That answer tells you which debt to address next, and gives you a concrete basis for prioritizing investment rather than guessing.

The measure of success: One workflow operating under minimum rules, with data foundations that support reconcilable outputs. Not five workflows. One, cleanly.

Variant 4: Industrialization

This is right for you if your situation looks like this: AI is scaling, but the operating model needs hardening before you can safely expand delegation further.

Days 1–30: Audit governance drift across all active workflows

As workflows multiply, rules that were designed for one context get applied informally to others. Ownership structures that were clear at the start become ambiguous as teams change. Run a structured review across every active AI workflow: Are permission boundaries still accurate? Are escalation paths still functioning? Is the audit trail complete and consistent across entities?

Days 31–60: Fix exception handling and escalation gaps

The failure mode at this stage is edge cases falling into gaps in the escalation logic: unusual transactions, cross-entity exceptions, policy boundary cases. These gaps are invisible under normal conditions and expensive when they surface during close or under audit. Map the exception types that have arisen across active workflows. Confirm each has an owner, a resolution path, and a log entry.
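The final check in that step (every exception type has an owner, a resolution path, and a log entry) lends itself to a simple register review. The sketch below is illustrative only: the register format, exception types, and role names are assumptions.

```python
# Hypothetical exception register; exception types and fields are illustrative.
exceptions = [
    {"type": "cross_entity_invoice", "owner": "AP Lead",
     "resolution_path": "manual review", "logged": True},
    {"type": "policy_boundary_case", "owner": None,
     "resolution_path": "manual review", "logged": True},
    {"type": "unusual_transaction", "owner": "Financial Controller",
     "resolution_path": None, "logged": False},
]

# The three things each exception type must have.
REQUIRED = ("owner", "resolution_path", "logged")


def coverage_gaps(register):
    """Map each exception type to the required items it is missing."""
    gaps = {}
    for item in register:
        missing = [field for field in REQUIRED if not item.get(field)]
        if missing:
            gaps[item["type"]] = missing
    return gaps


gaps = coverage_gaps(exceptions)
print(gaps)
# {'policy_boundary_case': ['owner'], 'unusual_transaction': ['resolution_path', 'logged']}
```

An empty result means every known exception type is covered; it says nothing about exception types that have not yet occurred.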

Days 61–90: Extend governance to the next tier of workflows

With a hardened operating model in place, expand AI delegation to the next category of workflows, typically those that are more complex, higher-value, or closer to core financial controls. Each new workflow should meet the same auditability standard as the first.

The measure of success: Every AI-influenced decision across all active workflows can be traced, explained, and owned, including workflows added after the original deployment.

The aha: Finance AI is a legitimacy problem

Most CFOs enter this series thinking their AI program has a tooling problem, a change management problem, or a data infrastructure problem that will eventually get fixed. The data points to something more specific and solvable.

In finance, AI must earn three kinds of legitimacy before it scales.

Operational legitimacy means AI can do real work in processes that matter. Most teams achieve this early, which is why pilots produce genuine results.

Institutional legitimacy means the organization can defend AI's use under audit, policy, and accountability scrutiny. This is where rules debt becomes the gating factor: a governance gap that better tooling alone cannot close.

Data legitimacy means the organization can trust the data environment AI relies on. This is where data debt becomes the gating factor. Strong governance built on fragmented data produces AI that is governed but not operational.

Read against the rest of this series: institutional legitimacy is rules debt's twin, and data legitimacy is data debt's. Both describe the same gap from the inside out: the operating conditions that determine whether AI can be defended, not just deployed.

Operational legitimacy comes first, which is why pilots are easy. Institutional and data legitimacy take longer and cannot be skipped, which is why scale is hard.

The teams that close this gap fastest are the ones that identify which legitimacy is missing and build it directly, rather than adding more tools or running more pilots on top of the same gap.

That coordination work — connecting rules, data, and accountability across a workflow — is what finance orchestration delivers in practice. It's why orchestration, not adoption, is the operating-model frontier.

Konstantin adds:

The CFOs who scale AI furthest won’t be the ones who moved fastest. They’ll be the ones who knew which constraint to fix, fixed it first, and expanded only when the operating model could carry it. That is not caution. That is how you build something that compounds.

CFOs: Your biggest takeaway

Across four reports and 1,520 respondents, one finding holds across every segment, every company size, every level of self-reported maturity: the difference between AI that scales and AI that stalls exists in the operating conditions around the technology, not in the technology itself.

The CFOs who are furthest ahead share one habit: before expanding AI into a new workflow, they ask whether they can defend it under scrutiny, alongside whether it will produce results.

That question, "Can we defend this?", is the practical test for both debts. If you cannot clearly state what AI is permitted to do and who owns the outcome, you have rules debt. If you cannot reconcile the output back to a source of truth, you have data debt. If you can do both, you are ready to scale.

The sequencing matters. Paying down the wrong debt first, or attempting to fix everything simultaneously without a bounded starting point, is the pattern behind most stalled finance AI programs. The organizations that pay down the right debt, in the right order, create compounding returns: each governed workflow makes the next one easier to deploy, harder to challenge, and more valuable than the last.

That’s what orchestration looks like in practice: not a single deployment, but a finance operating model that gets stronger with every addition.

Next steps

Find your posture. Then fix the one thing.

The diagnostic below maps directly to the four variants in this report. If you're not sure which constraint is holding your program back, start here:

Pick one workflow where AI is already active or planned
Question 1: Can you state what AI is permitted to do, and who owns the outcome? If no: rules debt
Question 2: Can you reconcile AI outputs back to a system of record? If no: data debt
If both answers are No: Variant 3. If both answers are Yes: Variant 4
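The diagnostic above is a two-question decision rule, so it can be written out in a few lines. This is only a sketch of the logic as stated in this report; the function and parameter names are hypothetical.

```python
def pick_variant(can_state_rules_and_owner: bool, can_reconcile_outputs: bool) -> str:
    """Map the two diagnostic answers to one of the four variants."""
    if not can_state_rules_and_owner and not can_reconcile_outputs:
        return "Variant 3: Both debts present"
    if not can_state_rules_and_owner:
        return "Variant 1: Rules-debt first"
    if not can_reconcile_outputs:
        return "Variant 2: Data-debt first"
    return "Variant 4: Industrialization"


# Question 1 answered No, Question 2 answered Yes.
print(pick_variant(False, True))  # Variant 1: Rules-debt first
```

Answer both questions for one workflow at a time; different workflows in the same organization can land on different variants.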

This is exactly the sequence Farah Rouassi, VP Finance and Strategic Partnerships at Paradox, went through when she brought AI into her finance team with Payhawk.

Travel bookings that used to take over an hour of back-and-forth now take four minutes. Receipt chasing that once consumed two full days a month now happens automatically. The AI runs inside governed workflows, so every transaction, approval, and submission follows the rules her team set. With Payhawk, Paradox has achieved:

93% faster travel bookings, from over an hour to four minutes
Zero days spent chasing receipts, down from two days a month
One platform for cards, invoices, travel, and approvals, all governed by rules the finance team sets

Farah says:

My financial controller agent sends all the reminders during the month — at the end of the month, I don't have to send any messages. No stress, no patience needed.

That's what it looks like when the operating conditions are right: not just AI that works, but AI that holds up, scales, and gives finance time back for the work that actually matters.

If you want to see how it works in practice, explore how Payhawk applies AI inside spend management, accounts payable, and travel, with controls, audit trails, and accountability built in from the start. Book a demo with a member of the Payhawk team.

This is the fourth and final report in the CFO AI Readiness series, based on original research conducted with IResearch across 1,520 senior finance and business leaders globally.

Methodology: Interviews conducted across DACH, Spain, France, Benelux, the UK & Ireland, and the United States. Respondents: C-suite, VPs, Directors, and senior individual contributors across Finance, Accounting, Sales, HR, and Procurement. Industries: Services, Digital, Manufacturing, Healthcare, Education and Non-profit, and B2C. Company size: 50–100 FTE, 101–250 FTE, 251–500 FTE, 501–1,000 FTE, and 1,000+ FTE. The five scaling requirements were each rated on a 1–7 agreement scale. "Strongly in place" refers to scores of 6–7.