Executive Viewpoint: One AI Won’t Save You

The property/casualty insurance industry has made artificial intelligence a top strategic priority for three consecutive years. It has invested accordingly. And it has, by most objective measures, very little to show for it operationally.

Executive Summary

The carriers that are winning with AI didn’t find the right single solution. They stopped looking for one, asserts Bill Devine, co-founder and managing partner at Naitiv.

Here, he lays out three mistakes that carriers who don’t succeed with AI are making: searching for a single solution, trying to fit AI into workflows designed for a different era, and building governance models around individual AI agents instead of the outcomes produced by the cumulative actions of many agents and humans. He then proposes a new playbook built for an AI future.

BCG’s 2024 global study found that only 7% of insurance carriers have successfully scaled AI beyond pilot programs.
Bain & Company’s 2025 Claims Maturity Assessment found that while 78% of P/C insurers use generative AI in some capacity, just 4% have scaled it enterprise-wide.
Guidewire’s analysis puts the pilot-to-production rate at 20-30% at best.

For an industry that treats AI as an existential strategic imperative, these numbers represent a fundamental failure of execution. Most telling of all, there is no credible public example of AI being meaningfully accretive to any carrier’s bottom line today.

The question worth asking is why.

The technology works. The use cases are well-documented. The investment is real. So, what is actually broken?

Three things. Carriers are searching for a single solution in a world that requires many. They are trying to fit AI into workflows that were designed for a different era. And they are building agent-by-agent audit trails that will be completely useless the day a market conduct examiner or plaintiff’s attorney asks them to reconstruct the full decision chain behind a specific claims or underwriting outcome.

The architecture is wrong. The workflows are wrong. And the governance model most carriers are building will fail the test that matters most.

Looking for the Wrong Thing

The pattern visible across the carriers stuck in pilot mode is consistent. They are searching for a giant AI solution—one platform, one vendor, one system that can be deployed across the enterprise and deliver the transformation the board has been promised. That search is what is failing them.

It fails because it does not match how AI works at its best, at least today. The most capable AI systems are not monolithic. They are not single models doing everything. They are ecosystems of narrow specialists, each one purpose-built for a specific task, each one sourced from the partner best positioned to deliver it, coordinated through an orchestration layer that routes work intelligently and keeps humans accountable for the decisions that matter.

Andrew Ng, whose research on agentic AI architecture has shaped how the enterprise understands this space, put the constraint plainly at Davos in January 2026: “For many jobs, AI can only do 30-40% of the work now and for the foreseeable future.”

The implication is precise. No single AI system will handle everything. The 30-40% that AI can do today is distributed across dozens of discrete tasks, each one a candidate for a specialist agent, each one potentially sourced from a different provider who has built deep capability in that specific domain. A carrier looking for one solution to capture all of that value will not find it. A carrier willing to assemble the right specialists and orchestrate them will.

The enterprise technology industry has reached the same conclusion. Jensen Huang, CEO of Nvidia, has stated publicly (at ServiceNow’s Knowledge 2025 conference in May 2025) that the platform capable of orchestrating, governing and improving an ecosystem of specialized agents is where durable enterprise value resides, naming the orchestration layer—not any individual model—as the strategic asset. Bill McDermott, CEO of ServiceNow, has been equally direct. “AI doesn’t replace enterprise orchestration; it depends on it. It depends on governance. It depends on scale,” McDermott said during his company’s fourth-quarter 2025 earnings conference call.

The intelligence is not the asset. The infrastructure that coordinates it is.

The Model That Works

Think of your best underwriter not as someone whose job is at risk from AI but rather as a conductor who currently has no orchestra. They spend a significant portion of their time playing the instruments themselves (gathering submission data, chasing missing information, normalizing exposures, checking against guidelines, drafting initial positions) because no specialized capability exists to handle those tasks.

The AI model that works gives that conductor an orchestra: a collection of best-in-class specialist agents, each sourced from the partner with the deepest capability in that domain, coordinated by a platform that manages the flow between agents and the underwriter.

What this looks like in practice is not a single vendor relationship. It is a deliberate ecosystem. A best-in-class agent for submission data extraction. A different specialist for geospatial and climate risk enrichment. Another for real-time loss history cross-referencing. Another for guideline compliance checking. Another for subrogation monitoring. Each one selected because it is the best available tool for that specific task. Each one connected to the others through an orchestration layer that the carrier controls.

None of these agents underwrite. Each handles a task that today consumes professional time without requiring professional judgment. Collectively, they produce an information environment so complete and current that the underwriter concentrates on what actually demands their expertise: the complex risk, the judgment call, the relationship.

The underwriter does not disappear. Their leverage multiplies.

The carriers capturing this value are not waiting for a single solution to arrive. They are building the ecosystem now. According to BCG, leading firms deploying this model report productivity gains exceeding 30%. AIG, for example, has reported a 26% year-over-year increase in underwriting submissions processed in its Lexington business through AIG Assist and a 35% improvement in submit-to-bind ratio since rollout to Lexington Middle Market Property. Travelers, similarly, has cut the time it takes to register new business submissions from two hours to two minutes.

These are production results from carriers that changed their architecture, not their ambition.

The Playbook

The barriers between most carriers and this model are organizational. The playbook to address them is straightforward.

Start with the data. A multi-partner agent ecosystem depends on data that is clean, integrated and accessible in real time. Most carriers have the data, accumulated over decades across policy, claims and third-party systems. What most lack is the integration architecture that makes it accessible to agents from multiple providers without a bespoke build for every connection.

A rigorous audit of data architecture is the mandatory first step: what exists, where it lives, what its quality is, and what is required to make it usable across a multi-partner deployment. A Novidea survey published in late 2025 found 95% of insurance professionals cite existing platform limitations as a top AI constraint, with integration leading. That is not a technology problem. It is an investment decision.

Map the workflows at the task level and be willing to reinvent them. This is where most carriers make a critical mistake. They document their existing workflows and ask where AI can be inserted. That approach produces marginal gains at best and disappointment at worst. Workflows designed for human execution over decades are built around the constraints of human capacity: the bottlenecks, the handoffs, the information gaps that people learned to work around. Wedging AI into those workflows preserves every structural inefficiency while adding a layer of technology complexity on top.

The right question is not where AI fits in the current workflow. It is what the workflow should look like if it were designed from scratch today with AI capabilities available from the start. That redesign will look different. Handoffs that exist because humans needed time to gather information disappear when an agent delivers it instantly. Review steps that exist because manual processes produced errors become lighter when agents enforce consistency. Escalation paths that were informal become explicit and documented.

This is harder work than mapping the status quo, but it is the work that produces the 30% productivity gains rather than the 5% efficiency improvements. Task-level documentation that accompanies this redesign also answers the partner selection question directly. Once tasks are defined at this precision, the right specialist for each becomes clear.

Select the right orchestration platform before adding agents. The orchestration layer is the strategic asset, not any individual agent or any individual partner. Carriers should not be building this infrastructure themselves. They should be selecting a platform partner that has designed workflow management for AI agents at its core—one with native capabilities for routing work between specialists, maintaining workflow state across handoffs, governing agent behavior, and surfacing exceptions for human review.

The right platform partner is model-agnostic and vendor-neutral. It connects to the best specialist for each task regardless of who built it, and it keeps the routing logic, escalation criteria and governance rules under the carrier’s control.

Carriers that make this selection deliberately can add new specialist agents from any partner quickly and at low marginal cost. Carriers that skip this step and connect agents point-to-point create integration debt with every addition and end up owned by their vendors rather than owning their ecosystem.

Govern at two levels, not one. Most carriers thinking about AI governance today are focused on the first level: tracking the compliance and performance of each individual agent. Is this model producing accurate outputs? Is it operating within defined parameters? Is it exhibiting bias across protected classes? These are legitimate and necessary questions. Monitoring individual agent performance, maintaining model documentation, and demonstrating compliance with NAIC principles and applicable state guidance are the minimum requirements. Carriers that have not built this capability yet are already behind.

But the second level of governance is where the real exposure lives and where almost no carrier is prepared. The question is not whether each agent performed correctly in isolation. It is whether the cumulative effect of every agent and every human touchpoint across a transaction produced an outcome that is defensible end to end. A claims denial, a coverage determination, a declination—these outcomes are the product of a chain of decisions made by multiple agents from multiple partners and multiple humans across a workflow that may span days or weeks. When a market conduct examiner or a plaintiff’s attorney asks to reconstruct that chain, they will not accept an answer organized by vendor. They will want to know what happened to that specific transaction, in sequence, from first touch to final outcome, and why each step produced the result it did.

The multi-partner architecture that makes the agentic model powerful is precisely what makes this governance challenge hard. Each agent operates correctly within its own scope. Each platform logs what it sees. But no individual vendor’s audit trail spans the full transaction. If the carrier has not built a governance architecture that captures the complete decision chain across all agents, all platforms and all human interventions in a single reconstructable record, that record does not exist.

The market conduct exam that is coming, and the class action that is also coming, will not care that the platforms were working individually. They will care whether the carrier can account for the outcome. Carriers that cannot will find themselves unable to defend decisions they made correctly, because they cannot prove how they made them.
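For readers who want to see the idea concretely, the following is a minimal sketch of a transaction-level decision record, not a description of any vendor's product or any carrier's actual system. Every name in it (DecisionEvent, DecisionLedger, the actor labels, the claim number) is a hypothetical chosen for illustration. It assumes each agent and each human step writes one event to a carrier-controlled ledger, so the chain for a single transaction can be replayed in sequence regardless of which partner performed each step.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DecisionEvent:
    """One step taken on a transaction (all field names are hypothetical)."""
    transaction_id: str  # the claim or submission this step belongs to
    actor: str           # which agent, platform, or person acted
    actor_type: str      # "agent" or "human"
    action: str          # what was done
    rationale: str       # why the step produced the result it did
    timestamp: datetime


@dataclass
class DecisionLedger:
    """Carrier-controlled record spanning every vendor and human touchpoint."""
    events: list = field(default_factory=list)

    def record(self, event: DecisionEvent) -> None:
        self.events.append(event)

    def reconstruct(self, transaction_id: str) -> list:
        """Return one transaction's steps, first touch to final outcome."""
        steps = [e for e in self.events if e.transaction_id == transaction_id]
        return sorted(steps, key=lambda e: e.timestamp)


# Usage: events arrive out of order from different (hypothetical) sources.
start = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)
ledger = DecisionLedger()
ledger.record(DecisionEvent("CLM-001", "adjuster", "human", "deny_claim",
                            "Policy exclusion applies", start + timedelta(hours=3)))
ledger.record(DecisionEvent("CLM-001", "extraction-agent", "agent", "extract_fnol",
                            "Parsed first notice of loss", start))
ledger.record(DecisionEvent("CLM-002", "extraction-agent", "agent", "extract_fnol",
                            "Parsed first notice of loss", start + timedelta(hours=1)))
ledger.record(DecisionEvent("CLM-001", "coverage-agent", "agent", "flag_exclusion",
                            "Loss cause matched an exclusion clause",
                            start + timedelta(hours=2)))

chain = ledger.reconstruct("CLM-001")
for step in chain:
    print(step.timestamp.isoformat(), step.actor_type, step.actor, step.action)
```

The point of the sketch is the organizing key: the record is indexed by transaction rather than by vendor, which is the form an examiner's question would take. A production design would also need durable storage, tamper evidence and retention rules, none of which are shown here.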

Put a named executive in charge of the outcome, not the technology. The data work belongs to operations and technology together. The workflow documentation belongs to the business. The orchestration platform belongs to IT. Governance belongs to legal and compliance. The only way these workstreams converge is if a single empowered leader owns the result and has the authority to hold each function accountable for its contribution. Without that owner, the work fragments. The pilots continue. The gap to the carriers already scaling widens.

The Liability Is Already Accumulating

Every quarter spent searching for the one AI platform that does everything is a quarter not spent building the ecosystem that already works. The carriers in that 7% did not find a better single solution. They started by assembling specialists. The advantage they are building is not a technology lead. It is an operational lead, compounding every quarter in expense ratio, operations productivity, loss adjustment expense (LAE) and risk selection quality.

But the more urgent argument is the one that belongs to legal and compliance, not the C-suite technology agenda. Every carrier that is deploying AI agents today without a cross-transaction governance architecture is accumulating a liability it cannot yet see. The decisions being made right now (by agents, by humans working alongside agents, by workflows that hand off between the two) are producing outcomes that will eventually be examined. Claims denials. Coverage determinations. Declinations.

The plaintiffs’ bar and the state insurance departments do not yet have the playbook for AI-assisted decision-making, but they are developing it. When they arrive with it, the carriers that can reconstruct the complete decision chain for any specific transaction will be in a fundamentally different position than those that cannot.

The infrastructure required to support a multi-partner agentic model—clean data, redesigned workflows, a carrier-controlled orchestration platform, governance at both the agent and transaction level—is the same infrastructure that any more advanced AI future will require. Building it now is not a bet on a particular technology. It is a no-regret investment that delivers operational returns today and legal defensibility tomorrow.

The front-line insurance professional is not going anywhere. The question is whether leadership gives them the tools to do their job at a level the industry has never seen before or leaves them playing every instrument alone while the carriers that chose to act pull further ahead every quarter.

The playbook is clear. The parts are available. The liability is building. The decision to act is the only variable left.