The uncomfortable truth about GenAI transformations
Many teams are bolting a chatbot on top of a data mess and hoping for magic. Then everyone wonders why the ROI is soft, the risk team is nervous, and the board is unconvinced.
If you want dependable answers, not just impressive demos, generative AI needs an old-school companion: traditional machine learning and rules-driven systems, wired in properly. That's the "best friends" relationship we should be designing for.
GenAI is brilliant at words, terrible at ownership
Large Language Models (LLMs) are astonishing at pattern-matching language. They summarise, rewrite, translate, and role-play better than anything we've had before. But they are probabilistic text engines, not systems of record.
In high-stakes domains, that gap shows fast. Across infrastructure and operations, IT leaders keep saying the same thing: when AI isn't wrapped around clearly defined processes, metrics, and owners, it stalls or under-delivers. "Just add AI" is not a strategy. It's a risk. And if you're treating the LLM as the brain of the system, you're already upside down.
What traditional AI actually brings
By "traditional AI" I mean the machinery that already runs your business (jargon incoming – ask your data scientist if any of it is unfamiliar): gradient boosting, random forests, and logistic regression; survival models for churn, default, or program completion; recommenders and anomaly detection; rule engines; and contextual bandits for real-time optimisation.
In some consumer-product businesses I've worked with, adding predictive AI has lifted operational prediction accuracy from ~70% to ~97–98%. Traditional models are trainable and auditable on your own data. They give you explicit features, coefficients, and consistent behaviour, and they live under governance, monitoring, and version control. LLMs can't replace that foundation. What they can do is sit on top of it.
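To make the auditability point concrete, here is a minimal sketch of a logistic regression trained by plain gradient descent, where every weight is an inspectable number tied to a named feature. The feature names, synthetic data, and effect sizes are all illustrative, not from any real system.

```python
import numpy as np

rng = np.random.default_rng(0)
features = ["tenure_months", "late_payments", "utilisation"]
X = rng.normal(size=(500, 3))
# Synthetic label: risk driven mostly by late payments (made-up effect sizes).
y = (1.5 * X[:, 1] - 0.4 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(float)

# Plain gradient-descent logistic regression: no black box, just weights.
w = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))        # predicted probability
    w -= 0.1 * (X.T @ (p - y)) / len(y)       # gradient step on log-loss

# One auditable line per feature: name, coefficient, direction of effect.
for name, coef in zip(features, w):
    print(f"{name}: {coef:+.3f}")
```

This is exactly what a governed model review can sit on: the coefficient for each feature is a single number you can log, version, and challenge.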
To keep it simple: predictive underneath, generative on top. This is the most robust architecture we use: GenAI handles unstructured interaction and orchestration, while traditional AI handles decisions, predictions, and constraints.
You see this in complex settings. Chatbots fine-tuned on expert dialogues only produced reliable outcomes once they were wrapped in safety systems, risk classifiers, and human supervision. Risk and allocation then run through rule-based safety flows and separate machine-learning matching models. The LLM handles conversation and retrieval, not decision-making.
For example, you don't want an LLM deciding whether to extend a loan, approve a claim, or change a dose. You want it to explain the decision, collect missing context, and orchestrate call-outs to the models that actually own those decisions.
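The ownership boundary above can be sketched in a few lines. Everything here is hypothetical – the threshold, the function names, and the "LLM" layer (a stand-in for a real conversational model) – but the division of labour is the point: the predictive model owns the decision, and the conversational layer only collects context and explains.

```python
def risk_model(income: float, debt: float) -> dict:
    """Stand-in for a governed credit model; it owns the decision."""
    ratio = debt / max(income, 1.0)
    return {"approved": ratio < 0.35, "debt_to_income": round(ratio, 2)}

def copilot_reply(application: dict) -> str:
    """LLM-shaped layer: gathers missing context and explains; never decides."""
    missing = [k for k in ("income", "debt") if k not in application]
    if missing:
        return f"Could you share your {', '.join(missing)}?"
    decision = risk_model(application["income"], application["debt"])
    verdict = "approved" if decision["approved"] else "declined"
    return (f"Your application was {verdict}: debt-to-income is "
            f"{decision['debt_to_income']}, against our 0.35 threshold.")

print(copilot_reply({"income": 60000}))                 # asks for missing data
print(copilot_reply({"income": 60000, "debt": 30000}))  # explains the model's decision
```

Swap `risk_model` for your real scoring service and the conversational layer never needs to know how the decision is made – only how to explain it.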
How this plays out in FinTech and MarTech
For CFOs and FinTech teams, the stack looks like this: under the hood, risk models predict default, loss, and value. On the surface, a GenAI co-pilot explains declines and pricing, gathers missing data, and simulates "what if we adjust term, collateral, or exposure?" – all driven by calls to the risk engines. Regulators see lineage; customers see a human explanation, not a cryptic code.
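The "what if" loop is mechanically simple when the risk engine owns the maths. A sketch, with a wholly made-up pricing formula and hypothetical names: the co-pilot never reprices anything itself, it just re-calls the engine with adjusted terms.

```python
def risk_engine(exposure: float, term_months: int, collateral: float) -> float:
    """Stand-in for the governed risk model: a toy expected-loss rate."""
    uncovered = max(exposure - collateral, 0.0)
    return round(0.01 * (term_months / 12) * (uncovered / exposure), 4)

def what_if(base: dict, **adjustments) -> dict:
    """Simulate a scenario by re-calling the risk engine, not by guessing."""
    scenario = {**base, **adjustments}
    return {"inputs": scenario, "loss_rate": risk_engine(**scenario)}

base = {"exposure": 100_000.0, "term_months": 36, "collateral": 20_000.0}
print(what_if(base))
print(what_if(base, collateral=60_000.0))  # more collateral, lower loss rate
```

The co-pilot's job is to translate "what if we take more collateral?" into that second call and narrate the difference – the lineage regulators care about is the call itself.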
For CMOs and martech teams, the pattern is similar. Traditional AI does segmentation and bandit-driven experimentation. GenAI generates and personalises content, constrained by brand and tone rules, regulatory filters, and the same framework. Letting a chatbot "be creative" with health or financial claims is a good way to collect regulators, not revenue.
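Bandit-driven experimentation is less exotic than it sounds. A minimal Thompson-sampling sketch over three message variants – names and conversion rates are invented for illustration. In the split the article describes, this layer decides *which* variant wins; GenAI only drafts the copy within brand and regulatory rules.

```python
import random

random.seed(7)
variants = {"plain": 0.05, "urgent": 0.08, "personal": 0.12}  # true (hidden) rates
wins = {v: 1 for v in variants}    # Beta(1, 1) priors
losses = {v: 1 for v in variants}

for _ in range(5000):
    # Sample a plausible rate per variant and send the best draw.
    pick = max(variants, key=lambda v: random.betavariate(wins[v], losses[v]))
    if random.random() < variants[pick]:   # simulated customer response
        wins[pick] += 1
    else:
        losses[pick] += 1

best = max(variants, key=lambda v: wins[v] / (wins[v] + losses[v]))
print(best, {v: wins[v] + losses[v] - 2 for v in variants})  # traffic per variant
```

Traffic shifts toward the better variant as evidence accumulates, so the experiment pays for itself while it runs – which is precisely the kind of behaviour you can't get from "let the chatbot be creative".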
Infrastructure implications you can't ignore
If you buy this, your infrastructure priorities shift: data plumbing first, multi-model routing, and evaluation harnesses.
You need reliable feature stores, labelling, and governance for predictive models, not just an LLM endpoint. Without that, the stack is just lipstick on a pig. Your orchestration layer has to route a request across LLMs (understanding + language), ML services (risk, fraud, churn, recommendation), and rule engines or knowledge bases. And you need "model bake-offs" comparing LLMs against fine-tuned or domain-specific models on cost, latency, safety, and task performance.
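The routing requirement can be sketched as a dispatch table – service names and logic here are hypothetical stand-ins, but the shape is the point: one orchestration layer sends each request to the LLM, a governed ML model, or a rule engine depending on what the task actually needs.

```python
def llm_service(req):  # unstructured language -> LLM
    return {"handler": "llm", "text": f"summary of {req['text']}"}

def fraud_model(req):  # prediction -> governed ML model (toy threshold)
    return {"handler": "ml", "score": 0.9 if req["amount"] > 10_000 else 0.1}

def rule_engine(req):  # hard constraint -> rule engine (toy allow-list)
    return {"handler": "rules", "allowed": req["country"] in {"DE", "FR"}}

ROUTES = {
    "summarise": llm_service,
    "fraud_check": fraud_model,
    "compliance": rule_engine,
}

def route(request: dict) -> dict:
    handler = ROUTES.get(request["task"])
    if handler is None:
        return {"handler": "none", "error": f"no owner for task {request['task']!r}"}
    return handler(request)

print(route({"task": "fraud_check", "amount": 25_000}))
print(route({"task": "compliance", "country": "DE"}))
```

The bake-off question then becomes measurable per route: for each task, compare candidate handlers on cost, latency, safety, and task performance before you wire one in.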
If you're a CIO, CFO, or CMO, start here: map 5-10 decisions (credit, pricing, fraud, retention, high-value CX flows). For each one, specify which model owns it today – or should. Design the conversation layer separately. Treat the LLM as a UX and orchestration service, not the brain. Invest in the unsexy bits, like data quality, feature stores, experiment platforms, and monitoring for both generative and predictive systems.
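The mapping exercise can start as something as unglamorous as a decision registry: for each decision, the owning model and what the LLM is allowed to do. The entries below are illustrative placeholders, not a recommendation – the gaps you can't fill are the real output.

```python
REGISTRY = [
    {"decision": "credit_approval", "owner": "gbm_risk_v4",       "llm_role": "explain + collect context"},
    {"decision": "fraud_hold",      "owner": "anomaly_detector",  "llm_role": "explain"},
    {"decision": "retention_offer", "owner": "churn_survival_v2", "llm_role": "draft message"},
]

def owner_of(decision: str) -> str:
    """Look up which model owns a decision; a miss is itself a finding."""
    for entry in REGISTRY:
        if entry["decision"] == decision:
            return entry["owner"]
    raise KeyError(f"no model owns {decision!r} yet -- that is the finding")

print(owner_of("credit_approval"))
```

A spreadsheet does the same job; the discipline is writing an owner down for every decision before any conversation layer goes near it.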
Generative AI on its own gives you clever words. Traditional AI on its own gives you cold decisions. If you want systems you can trust with real money, real customers, and real clinical risk, you need them working together.