Most insurers I speak to have a version of the same story. An AI pilot launches with real momentum — a compelling demo, executive sponsorship, a clear use case. Six months later, it has quietly stalled. The technology sits in a holding pattern, neither retired nor truly operational. The team moves on to the next proof of concept, and the cycle repeats.

This is not a technology failure. The models work. What fails — repeatedly and predictably — is everything that has to happen around the model to make it useful in a real operating environment. The gap between a compelling pilot and a foundational, production-grade workflow is not a model problem. It is a process, data, and determination problem.

Understanding why that gap exists — and how to close it — is the most important question in insurance AI right now.

Hear the full discussion on this topic with leaders from Everest, Sun Life, 360 Digital Immersion, and Cytora in the Insurtech Insights on-demand webinar, AI as a Partner, Not a Threat – How Do You Balance Automation and Human Expertise?

Why Pilots Succeed (and Why That’s the Problem)

Pilots are, almost by design, set up to succeed. They operate with clean data, controlled scope, and willing participants. The submission flows chosen for the demo are the tidy ones. The stakeholders in the room are the ones already converted. The business process being tested is simplified enough to be manageable.

The result is that the demo environment flatters the technology — and masks the complexity of the real operating environment waiting on the other side. When the pilot succeeds, it creates confidence that the hard part is done. It isn’t. The hard part hasn’t started yet.

Large language models are capable of far more than most organisations are currently asking of them. Even on the most unstructured, inconsistent broker submissions, the technology performs. The constraint is not the model — it is the infrastructure, process, and data that surrounds it. And that is precisely where the hard work lies.

The Three Real Barriers

In our experience, roughly 80% of deployment effort sits not in the AI itself, but in three areas that rarely feature in the demo.

Process re-engineering. AI does not slot neatly into existing workflows. It exposes them. Processes that have operated unchanged for a decade are suddenly visible in ways they weren’t before — their inefficiencies, their ambiguities, their undocumented judgment calls. ‘Human in the loop’ is easy to say in a presentation. Determining exactly when, where, and how the human intervenes, across every edge case, is a genuinely hard design problem. And many of those answers require changing how work has been done for years.

Legacy system integration. Most carriers have no choice but to work with the systems they have. API reliability, response latency, and data availability are not AI problems in origin — but they become AI blockers in practice. The path forward is not to wait for infrastructure transformation. It is to design intelligently around the constraints that exist today, building workflows that can operate within the hand you are dealt.

Data quality. AI does not hide bad data. It surfaces it — often at the worst possible moment. Many organisations discover mid-deployment that data they believed was clean and complete is neither. Resolving this is slow, unglamorous work. It is also non-negotiable. If the underlying data is wrong, the end user never experiences the full benefit of the solution, regardless of how capable the model is.

None of these barriers is insurmountable. But none of them resolves itself. They require time, iteration, and a willingness to look hard at the parts of the operation that are uncomfortable to examine.
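The legacy-integration principle above — design around today's constraints rather than waiting for them to disappear — has a concrete shape in code. A minimal sketch (the function name, parameters, and fallback behaviour are illustrative assumptions, not a prescribed pattern) of wrapping an unreliable legacy endpoint with bounded retries and a graceful fallback, so a timeout degrades into a human follow-up instead of blocking the workflow:

```python
import time

def call_with_fallback(fn, retries=3, base_delay=0.5, fallback=None):
    """Call a flaky legacy endpoint with bounded retries and exponential
    backoff; on final failure, degrade gracefully instead of blocking."""
    for attempt in range(retries):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            if attempt < retries - 1:
                time.sleep(base_delay * (2 ** attempt))  # back off before retrying
    return fallback  # e.g. queue the submission for human follow-up
```

The design choice worth noting is the fallback: the workflow never halts on infrastructure failure, it routes around it — which is exactly the "work with the hand you are dealt" posture described above.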

The Right Mental Model: Crawl, Then Run

The most effective approach we have found is to start with the entry-level value and build from there — deliberately, systematically, without skipping steps.

Step one is digitising the intake: turning unstructured documents into decision-ready data. A broker submission, an attending physician statement, a claims notification — these arrive as documents. Until they become structured data, you cannot execute a workflow against them. The good news is that this step is highly achievable without deep legacy integration, and it delivers immediate, visible value. Large language models are genuinely impressive at taking a series of hard-to-decipher, inconsistently formatted broker submissions and turning them into something complete, structured, and underwriter-ready.
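"Decision-ready data" means, in practice, agreeing a target schema and normalising every extracted field into it. A minimal illustration — the schema and field names here are assumptions for the sake of the example, not a standard — of the normalisation step that sits between raw extraction and an underwriter-ready record:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative target schema -- the field names are assumptions, not a standard.
@dataclass
class SubmissionRecord:
    insured_name: str
    line_of_business: str
    limit_requested: Optional[float]  # normalised to the carrier's base currency
    broker: str

def normalise_limit(raw: str) -> Optional[float]:
    """Turn free-text limits like '$5m' or '5,000,000' into a number,
    returning None (flag for human review) rather than guessing."""
    s = raw.lower().replace("$", "").replace(",", "").strip()
    if s.endswith("m"):
        s = s[:-1]
        try:
            return float(s) * 1_000_000
        except ValueError:
            return None
    try:
        return float(s)
    except ValueError:
        return None

# A record as it might look after extraction and normalisation.
record = SubmissionRecord(
    insured_name="Acme Manufacturing Ltd",
    line_of_business="property",
    limit_requested=normalise_limit("$5m"),
    broker="Example Broker LLP",
)
```

Note the deliberate choice to return `None` on anything ambiguous: flagging a field for human review is part of the intake design, not a failure of it.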

Step two is workflow execution: appetite matching, triage, broker lookups, routing, decisioning. This is where the real organisational complexity lives. These steps require context that is inherent to a specific carrier — what does your appetite actually mean? Where are the shades of grey? What distinguishes an in-appetite risk from an out-of-appetite one in the cases that aren’t obvious? These are not technological questions. They are disambiguation questions. They require conversations, explicit definition, and the painstaking work of turning judgment — which lives in people’s heads — into something that can be codified.
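That codification work has a concrete endpoint: appetite stops being a paragraph in a guideline document and becomes explicit rules, with the grey zone routed to a human rather than guessed at. A hypothetical sketch — the thresholds, occupancy classes, and outcome names are invented for illustration, not any carrier's actual appetite:

```python
from enum import Enum

class Appetite(Enum):
    IN = "in_appetite"
    OUT = "out_of_appetite"
    REFER = "refer_to_underwriter"  # the shades of grey, made explicit

# Illustrative rules for a hypothetical property book.
MAX_TIV = 50_000_000            # hard ceiling on total insured value
REFER_TIV = 25_000_000          # above this, a human makes the call
EXCLUDED_OCCUPANCIES = {"waste_recycling", "fireworks"}

def classify(occupancy: str, total_insured_value: float) -> Appetite:
    """Apply codified appetite rules; anything non-obvious is referred."""
    if occupancy in EXCLUDED_OCCUPANCIES or total_insured_value > MAX_TIV:
        return Appetite.OUT
    if total_insured_value > REFER_TIV:
        return Appetite.REFER   # in-appetite on paper, but a judgment call
    return Appetite.IN
```

The hard part is not writing these few lines — it is the conversations that produce the thresholds and the `REFER` band in the first place, which is exactly the disambiguation work described above.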

The temptation is to skip to step two without having properly completed step one, or to try to tackle both simultaneously. This is where many deployments stall. The foundations have to be right before the advanced use cases can land.

Grit as a Competitive Advantage

What I have observed in the carriers who successfully move from pilot to production is not that they have better models, larger budgets, or more advanced technology. It is that they have the determination to do the unglamorous work.

This means partners — internal and external — who are genuinely invested in the problem, not just the demo. People who will push through process ambiguity rather than route around it. Teams who treat the friction of legacy integration as a puzzle to solve, not a reason to pause. Organisations that accept the iterative, sometimes frustrating reality of operationalisation as the work itself, not as an obstacle to the work.

The reward for that persistence is significant. When AI moves from a tool that assists individual tasks to infrastructure that is foundational in an end-to-end workflow, the benefits compound. Speed of submission clearance. More sophisticated triage. Consistent, data-driven risk assessment. Faster responses to brokers and clients. The ability to process far more, without a proportional increase in resource. These are not incremental gains. They are structural advantages.

The Next Chapter Is Operational

The pilot graveyards that exist inside most carriers are not evidence that AI does not work. They are unfinished stories — proof that the technology outpaced the infrastructure around it. The next chapter is always an operational one, and writing it is squarely within the organisation’s hands.

My advice to transformation leaders is straightforward: stop commissioning new pilots and start commissioning the hard, iterative, deeply necessary work of making the last one foundational. Focus on one or two use cases where AI can play a meaningful role. Get the intake digitised. Map the process properly. Fix the data. Push through the legacy integration. And choose partners who have the grit and intent to work with you through all of it — not just until the demo looks good.

The technology is ready. The question is whether the organisation is willing to do the work to meet it.