Stop building flexible AI. Build opinionated AI.

Flexible-everything AI products feel modern and break in production. The frontier vendors have already shipped the answer: encode judgment into the default path.


The most reliable AI systems do not improvise their operating principles on every run. They encode judgment into the default path.

The flexible approach starts reasonably enough. You leave the model broad, expose every tool, and avoid hard rules so the assistant can "figure it out." Three weeks in, users are babysitting every action, your prompt has grown a paragraph for each edge case, and your team is in a Slack thread arguing about whether the agent should have asked before sending. The frontier vendors have already decided. OpenAI's "Introducing Operator" post describes specific moments where the system pauses and hands control back to the user, and Anthropic's computer use guidance recommends asking a human to confirm before consequential actions. Those are not neutral defaults. They are opinions shipped as product.

By the end of this post, you will know:

  • Why "flexible" AI design quietly pushes supervision onto your users
  • The three opinions the frontier vendors already encoded that you can copy on Monday
  • How vague systems generate prompt sprawl as a structural failure mode, not a discipline problem
  • Which opinions buy you the most trust per line of code

Opinionated is not the opposite of flexible. Vague is.

When an AI system avoids strong operating choices, it does not become more adaptable. It becomes more vague. And vague systems force the model, the user, and the operators to improvise on every turn.

An opinionated system has strong defaults and explicit judgments about how work should happen. Look at what the frontier teams chose to encode: Operator pauses on consequential moves and asks the user to take over. Anthropic's guidance recommends human confirmation before consequential actions. Both treat reads as cheaper than writes, treat preview-before-confirm as the default, and treat ambiguity as a reason to escalate rather than guess. None of that is a model capability. All of it is a product decision.

The rule: pick the three behaviors your product will commit to even when no one is watching, and encode them below the prompt. If they live only in the system prompt, they are suggestions. If they live in the tool layer — read tools and write tools separated, write tools gated on confirmation, ambiguity routed to a human — they are opinions.
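
To make "below the prompt" concrete, here is a minimal sketch of a tool layer that separates read tools from write tools and refuses to run a write without confirmation. Everything in it is an assumption for illustration: the names ToolLayer, Tool, Kind, and ConfirmationRequired are hypothetical, and the two example tools are stand-ins, not any vendor's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Kind(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class Tool:
    name: str
    kind: Kind
    run: Callable[[dict], str]


class ConfirmationRequired(Exception):
    """Raised when a write tool is called without explicit user approval."""


class ToolLayer:
    """Hypothetical registry that enforces the read/write boundary in code."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, args: dict, confirmed: bool = False) -> str:
        tool = self._tools[name]
        # The gate lives here, not in the prompt, so no wording can bypass it.
        if tool.kind is Kind.WRITE and not confirmed:
            raise ConfirmationRequired(f"{name} needs user confirmation")
        return tool.run(args)


# Illustrative stand-in tools; a real system would call actual services.
layer = ToolLayer()
layer.register(Tool("search_inbox", Kind.READ, lambda a: f"results for {a['query']}"))
layer.register(Tool("send_email", Kind.WRITE, lambda a: f"sent to {a['to']}"))
```

Calling layer.call("search_inbox", {"query": "invoice"}) runs immediately, while layer.call("send_email", ...) raises ConfirmationRequired until the caller passes confirmed=True. The prompt can still describe the boundary, but it no longer has to enforce it.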

Generic systems push the work onto your users

A highly generic assistant looks like freedom and feels like supervision. Your user has to clarify risk on every turn, watch every action, correct the system's posture by hand, reconstruct what happened after a failure, and re-teach the boundaries in the next session. That is not leverage. That is unpaid QA.

The Operator pattern inverts this. The system knows which actions need a human before they fire, and asks once. The user does not have to remember which buttons are dangerous, because the product already decided. The Anthropic computer-use guidance does the same job in a different domain: the model has the capability to act, and the product has the opinion about when not to.

The rule: if a user has to issue the same correction twice in one session, that correction belongs in the product, not in the conversation.
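
One way to apply that rule is to count corrections per session and flag any that repeat. The sketch below assumes some upstream step (not shown, and hypothetical) already turns a user correction into a short label such as "ask_before_send"; CorrectionLog and its methods are illustrative names, not part of any library.

```python
from collections import Counter


class CorrectionLog:
    """Counts labeled user corrections per session (labels come from hypothetical upstream code)."""

    def __init__(self, threshold: int = 2) -> None:
        self.threshold = threshold
        self._counts: Counter = Counter()

    def record(self, session_id: str, correction: str) -> bool:
        """Record one correction; return True once it has repeated within the session."""
        key = (session_id, correction)
        self._counts[key] += 1
        return self._counts[key] >= self.threshold

    def candidates(self) -> list:
        """Corrections that repeated in at least one session, de-duplicated and sorted."""
        return sorted({c for (_, c), n in self._counts.items() if n >= self.threshold})
```

Anything that shows up in candidates() is a behavior users are teaching the system by hand, which makes it a reasonable shortlist for the next default to encode.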

Good opinions create trust, in a specific way

Users do not trust an AI system because it can do many things. They trust it because it behaves consistently in ways that make sense for the work in front of them. Three behaviors carry most of the weight, and all three are the ones the frontier vendors already adopted:

  • It previews before it changes. A write action shows what it is about to do and waits for one click. Anthropic's computer-use docs recommend human confirmation for consequential actions.
  • It pauses when ambiguity appears. Operator hands control back when the next move is not safe to make alone. Your version of this is escalation on low-confidence routing, not a more confident prompt.
  • It separates reads from writes. Read tools run freely. Write tools live behind a confirmation. This is a structural choice in the tool layer, not a politeness rule in the prompt.

These three opinions cover the moments where AI systems most often lose user trust. Ship them as defaults and the rest of the product gets simpler — because the prompt no longer has to relitigate the same boundaries on every turn.
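
The three behaviors can be collapsed into a single decision function that runs before any tool fires. This is a sketch under stated assumptions: the confidence field is a hypothetical routing score your system would have to produce, and the 0.8 floor is an arbitrary placeholder, not a recommended value.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    RUN = "run"            # execute immediately
    PREVIEW = "preview"    # show the change and wait for one click
    ESCALATE = "escalate"  # hand control back to the user


@dataclass(frozen=True)
class ProposedAction:
    tool: str
    is_write: bool
    confidence: float  # hypothetical routing score in [0, 1]
    summary: str       # human-readable description shown in the preview


CONFIDENCE_FLOOR = 0.8  # assumed threshold; tune against your own traffic


def decide(action: ProposedAction) -> Decision:
    # Ambiguity is checked first, so a low-confidence action never reaches preview.
    if action.confidence < CONFIDENCE_FLOOR:
        return Decision.ESCALATE
    # Confident writes still wait for the user to approve the preview.
    if action.is_write:
        return Decision.PREVIEW
    # Confident reads run freely.
    return Decision.RUN
```

For example, decide(ProposedAction("send_email", True, 0.95, "Send reply to Dana")) returns Decision.PREVIEW, and the same action at confidence 0.4 returns Decision.ESCALATE. The ordering is the design choice: escalation wins over preview, so the user is never asked to approve something the system itself was unsure about.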

What the oracle predicts when you skip this

Three things happen to teams that keep their AI systems "flexible" instead of opinionated. You will recognise at least one of them:

  1. Your prompt grows a paragraph per edge case. Every new failure mode produces a new sentence in the system prompt. Six weeks in, the prompt is 1,200 tokens of "do this, but not when," and nobody on the team can predict what it will do on a case the prompt does not mention. The opinion was never encoded in the product, so the prompt became the product.
  2. Your support load tracks user count linearly. Trust never compounds, because the same dangerous-action conversation has to happen every time a new user shows up. Opinionated products earn trust once at the design layer. Vague products earn it one user at a time.
  3. Your next regression will be silent and unattributable. Without explicit defaults (preview before write, escalate on ambiguity, read/write separation), you cannot tell whether a bad outcome came from a bad model decision, a missing rule, or a tool boundary you never drew. Opinionated systems are easier to debug because each behavior maps to a specific design choice you can point at. Vague systems leave you guessing.

If your roadmap has "make the agent smarter" on it more often than "decide what the agent should never do," you are heading into all three.
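
The debugging point in item 3 becomes practical once every decision is logged with the name of the rule that produced it. A minimal sketch, assuming rule names like "preview_before_write" that you would define yourself; DecisionTrace and DecisionRecord are hypothetical names for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DecisionRecord:
    tool: str
    outcome: str  # e.g. "run", "preview", "escalate"
    rule: str     # the named design choice that produced the outcome


class DecisionTrace:
    """Append-only record of which rule decided each action."""

    def __init__(self) -> None:
        self._records: List[DecisionRecord] = []

    def add(self, tool: str, outcome: str, rule: str) -> None:
        self._records.append(DecisionRecord(tool, outcome, rule))

    def by_rule(self, rule: str) -> List[DecisionRecord]:
        """All decisions attributed to one named rule."""
        return [r for r in self._records if r.rule == rule]

    def to_jsonl(self) -> str:
        """One JSON object per line, with keys sorted for stable diffs."""
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self._records)


trace = DecisionTrace()
trace.add("search_inbox", "run", "reads_run_freely")
trace.add("send_email", "preview", "preview_before_write")
trace.add("delete_thread", "escalate", "escalate_on_ambiguity")
```

When a bad outcome shows up, trace.by_rule("preview_before_write") tells you whether the gate fired at all. If no rule is attached to the action in question, the gap is a boundary that was never drawn rather than a model mistake.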

The opinions worth having first

Not every opinion is equally valuable. The ones that buy you the most trust per line of code are about risk boundaries, approval posture, evidence requirements, failure handling, context discipline, and output structure. The frontier vendors have already told you which three to start with: preview-before-confirm on writes, escalation on ambiguity, and a hard separation between reads and writes. Those are not the only opinions you will ever need. They are the ones whose absence will cost you trust fastest.

The strongest opinions feel invisible in good systems. The user experiences the product as clear, fast enough, safe, consistent, and easy to trust. The opinion is still there — it is expressed through behavior rather than marketing language.

The real lesson

Three sentences.

The opposite of opinionated is not flexible; it is vague, and vague systems force everyone around them to improvise. The frontier vendors already chose the first three opinions worth shipping — preview before write, escalate on ambiguity, separate reads from writes — and you can copy them on Monday. If your product cannot answer "what behaviors do we want to be the default even when no one is watching," your users are doing that job for you on every turn.


If you are running an AI product that feels too eager to act, send me the three loudest behaviors it takes without asking, and I will tell you which one should be the next opinion you encode below the prompt. [email protected].