Almost everyone has tried generative AI. Far fewer have shipped a generative AI feature that real users keep coming back to. The gap isn't the model — that part has gotten easier every quarter. The gap is the product design and engineering around the model, and that's where most generative AI builds quietly stall. Here's what actually works in real products, what doesn't, and why.
The pattern that ships: AI in a workflow, not on a blank page
The single biggest mistake in early generative AI features is putting a free-text box in front of users and hoping they figure out what to type. They don't. The blank-page problem is real — and it's why so many "AI inside" features see one excited week of usage and then a long tail of nothing.
The pattern that wins, almost without exception, is embedding generative AI inside an existing workflow. The user is already doing the task. Your feature offers to do part of it for them — draft, summarise, restructure, suggest. The user accepts, edits, or rejects. There's no blank page because the workflow is the prompt.
You see this pattern everywhere serious now: code editors that complete the line you're already writing, design tools that fill the layer you've already selected, CRMs that summarise the call you just had. The work is already framed. The AI just removes the slow part.
The patterns that actually work
Across the generative AI builds we've shipped, the same handful of patterns keep showing up because they're the ones that survive contact with real users:
- Draft, don't decide. Generate a first version of something — an email, a report, a contract clause — that the user finishes. Faster than starting from nothing, no risk of the AI deciding something it shouldn't.
- Summarise the long thing. Long transcripts, long documents, long histories. Easy to evaluate (was the summary right?) and immediately useful.
- Classify and route. Tag a ticket, sort an inbox, prioritise a queue. Predictable input, predictable output, cheap and accurate.
- Search that understands you. Semantic search over your own data — the unsexy "RAG" pattern that's still the most useful generative AI feature in most products.
- Conversational interfaces for the right surface — support, onboarding, narrow knowledge bases. Not for everything; for the cases where natural language genuinely beats a form. (See our legal chatbot Lexa for what that looks like at the deep end.)
What unites these: a clear input, a clear output, and a way to tell whether the system did a good job.
The patterns that quietly fail
It's worth being honest about the ones that don't work, because they keep getting built:
- "AI agents" that do everything. Long-running, multi-step agents are intoxicating to demo and brutal to ship reliably. Drift, error cascades, cost spikes. They work for narrow, well-instrumented tasks. They don't yet work as general-purpose assistants in production.
- Open-ended chat as a UI for everything. Chat is great when the user has a question and bad when they have a task. A button is usually a better UI than a sentence.
- "AI" as the whole pitch. Users don't buy AI; they buy outcomes. A product that leads with "powered by AI" instead of the result it delivers is signalling that it doesn't know what the result is yet.
If your roadmap has any of these, it's worth a second pass before you fund the build.
The engineering you can't skip
The model is the easy part. Generative AI features that survive production all share the same scaffolding, regardless of model choice:
- Retrieval for anything that needs to know your facts (RAG over your own data).
- Structured outputs — JSON schemas, typed responses, validation. "Give me a string and hope it parses" is a recipe for 3am incidents.
- Guardrails — input filtering, scoping, safe refusal, prompt-injection defences.
- Evaluation — a set of representative inputs and expected behaviours you re-run every time you change a prompt or model. Without this you can't tell if you've improved or regressed.
- Cost controls — token monitoring, model selection per task, caching. A generative AI feature that's profitable in testing can become uneconomical at scale if no one's watching.
- Observability — logs of what was retrieved, what was generated, what the user did with it. The single most valuable artefact for improving the feature later.
We wrote a deeper version of this in why most AI features fail in production — it's the same lessons, applied specifically to generative AI.
The UX makes or breaks adoption
Even a great generative AI feature underneath can flop if the UX hides what it's doing or oversells what it knows. The two principles we keep coming back to:
- Be honest about uncertainty. Hedge when the system has reason to. Decline when it shouldn't answer. We covered this at length in designing for trust.
- Make editing easy. The user finishes the AI's draft. That step should feel inviting, not like a chore.
Generative AI features users love are the ones that feel like they did the work, with help — not the ones that feel like the AI did it for them and they have to clean up.
What to ship first
If you're standing at the start of a generative AI feature and wondering where to begin, the order that works is:
- Pick one workflow in your product where AI would remove obvious friction.
- Define the specific input, output, and quality signal for that one workflow.
- Ship the smallest version of that — hosted model, RAG if needed, structured output, basic guardrails.
- Put it in front of real users. Watch what they do. Measure.
- Iterate from there.
That's it. No agent platform, no model fine-tuning marathon, no infrastructure overhaul. The teams that ship great generative AI features in 2025 and 2026 are the ones that pick one thing and engineer the boring parts well.
We've helped clients ship generative AI features into 11+ shipped products, and the playbook above is the one that consistently survives the trip from demo to production. If you're sitting on a generative AI roadmap and wondering which bits are real and which are hype, book a call — we'll tell you which is which.