

How to Design an AI Assistant That Knows When Not to Act

The safest assistants are not the ones that can do the most. They are the ones that know exactly when to pause, verify, and ask.

If your assistant can act, it needs visible rules for when to pause, verify, and wait for a human.

Most teams evaluate AI assistants by asking one question:

"What can it do?"

That is the wrong first question.

The better question is:

"How does it decide when not to do something?"

That is where trust begins.

Plenty of assistants can answer questions, draft content, or call tools. Very few can reliably distinguish between a safe read, a risky write, and a high-consequence action that should wait for a human. That distinction matters more than most teams realize. A system that acts too early is not impressive. It is dangerous.

If you want an AI assistant clients and teams will actually trust, you need to design restraint into the system from the start.

A real-world signal

Two of the clearest real-world examples come from the companies shipping frontier agent products right now. In its official product announcement, Introducing Operator, OpenAI says the system asks the user to take over for sensitive steps like logins, payment details, and CAPTCHA challenges. Anthropic's official computer use documentation recommends asking for confirmation before consequential actions and keeping humans in the loop when side effects matter.

That is not timidity. It is product judgment.

The important lesson is simple: the more capable the assistant becomes, the more deliberately you need to design the moments where it stops, asks, and waits.

The real problem is not intelligence. It is judgment.

When buyers say they are worried about AI, they are usually not worried about the model being too weak.

They are worried about:

  • An assistant editing records without approval
  • A workflow sending the wrong message to the wrong person
  • A tool inventing certainty where none exists
  • A system doing something irreversible because the prompt "sounded confident"

In other words, they are worried about unbounded action.

That means the design goal is not "make the agent more capable."

It is "make the agent safe enough to be useful."

That requires an operational model for judgment.

Start with action tiers, not tools

One of the cleanest ways to design safe behavior is to classify actions by risk level before you worry about prompts, integrations, or UX.

For example:

  • Low risk: reading, searching, summarizing, comparing, extracting
  • Medium risk: drafting a proposed change, preparing a payload, assembling a recommendation
  • High risk: updating records, sending messages, publishing, deleting, triggering external side effects

This sounds obvious, but most failed assistant designs skip this step. They wire tools directly to the model and hope good instructions will be enough.

They usually are not.

Instead, decide what the assistant may do automatically, what it may prepare but not execute, and what always requires explicit approval.

That changes the entire user experience. The assistant stops feeling reckless and starts feeling reliable.
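The tier model above can be sketched as a simple registry lookup. This is a minimal sketch with hypothetical tool names; in a real system every tool would be assigned a tier at registration time, before it is ever exposed to the model.

```python
from enum import Enum

class Risk(Enum):
    LOW = "low"        # safe to run automatically
    MEDIUM = "medium"  # may be prepared, but not executed
    HIGH = "high"      # always requires explicit approval

# Hypothetical tool registry: classify before you wire anything up.
TOOL_RISK = {
    "search_records": Risk.LOW,
    "summarize_thread": Risk.LOW,
    "draft_update": Risk.MEDIUM,
    "update_record": Risk.HIGH,
    "send_message": Risk.HIGH,
}

def may_auto_execute(tool: str) -> bool:
    """Only low-risk tools run without a human in the loop.
    Unclassified tools default to HIGH, never to LOW."""
    return TOOL_RISK.get(tool, Risk.HIGH) is Risk.LOW
```

Note the default in the last line: a tool that was never classified is treated as high risk, so forgetting a classification fails safe rather than open.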

Read freely. Write carefully.

A useful principle is this:

Let the assistant read aggressively and write conservatively.

Read operations create understanding. Write operations create consequences.

That means your system should generally be comfortable with:

  • Looking things up
  • Checking status
  • Reading history
  • Gathering evidence
  • Explaining options

And much more cautious with:

  • Updating records
  • Contacting customers or candidates
  • Triggering downstream workflows
  • Changing state across multiple systems

This single distinction improves quality more than many teams expect. It keeps the assistant helpful without letting it become impulsive.
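One way to enforce "read freely, write carefully" is at the dispatch layer: reads execute immediately, writes become pending drafts. A rough sketch, with invented tool names standing in for real integrations:

```python
from dataclasses import dataclass, field

# Hypothetical tool sets; a real system would derive these from the registry.
READ_TOOLS = {"lookup", "check_status", "read_history"}
WRITE_TOOLS = {"update_record", "send_email", "trigger_workflow"}

@dataclass
class Dispatcher:
    pending_approvals: list = field(default_factory=list)

    def call(self, tool: str, **args) -> dict:
        if tool in READ_TOOLS:
            # Reads run immediately: they create understanding, not consequences.
            return {"executed": True, "tool": tool, "args": args}
        if tool in WRITE_TOOLS:
            # Writes are queued as drafts, never executed directly.
            self.pending_approvals.append((tool, args))
            return {"executed": False, "queued_for_approval": True}
        raise ValueError(f"Unclassified tool: {tool}")
```

An unclassified tool raises rather than executing, which mirrors the same fail-safe posture as the tier registry.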

Make "show me what will change" a product feature

One of the biggest trust accelerators is requiring the assistant to preview actions before execution.

Instead of:

"Done. I updated the account."

You want:

"Here is what I plan to change:

  • Owner: Sarah Chen -> Marcus Bell
  • Status: Qualified -> Active Pipeline
  • Next follow-up date: none -> March 28

Approve this change?"

That small shift does two things.

First, it gives the user a chance to catch mistakes before they become incidents.

Second, it teaches the user how the system thinks. Over time, that transparency builds confidence.

This is especially important when the assistant is operating in business systems where data quality and auditability matter.
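A preview like the one above does not need the model to compose it; it can be generated mechanically by diffing the current record against the proposed one. A minimal sketch:

```python
def preview_changes(current: dict, proposed: dict) -> list[str]:
    """Render field-level diffs as human-readable lines
    before anything is written."""
    lines = []
    for field_name, after in proposed.items():
        before = current.get(field_name, "none")
        if before != after:
            lines.append(f"{field_name}: {before} -> {after}")
    return lines

# Example mirroring the account update above
diff = preview_changes(
    {"Owner": "Sarah Chen", "Status": "Qualified"},
    {"Owner": "Marcus Bell", "Status": "Active Pipeline",
     "Next follow-up": "March 28"},
)
```

Because the diff is computed from data rather than generated as prose, the preview cannot describe a change the system is not actually about to make.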

Escalation is not failure

Many product teams treat escalation as a weakness.

It is the opposite.

A well-designed assistant should escalate when:

  • The target is ambiguous
  • The action has side effects outside the current system
  • The evidence is incomplete
  • The cost of being wrong is materially high
  • The request conflicts with policy

This is not the assistant "giving up." It is the assistant recognizing a boundary.

That is what mature systems do.

The best AI assistants do not try to win every turn. They protect the user from bad turns.
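The escalation conditions above can be expressed as a single predicate over an action's risk signals. The flag names here are assumptions; the point is that any one trigger is sufficient to escalate.

```python
def should_escalate(action: dict) -> bool:
    """Escalate if ANY trigger fires; no single signal is weighed away."""
    triggers = [
        action.get("target_ambiguous", False),
        action.get("external_side_effects", False),
        action.get("evidence_incomplete", False),
        action.get("high_cost_of_error", False),
        action.get("policy_conflict", False),
    ]
    return any(triggers)
```

Using `any` rather than a weighted score keeps the rule legible: an operator can always answer "why did it escalate?" with a specific trigger.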

Require grounding before claims

Another important pattern is grounding.

If an assistant says:

"Your inbox is clear."

or

"There are no new replies."

or

"This customer has not responded."

those are not harmless sentences. They are factual claims about live state.

The assistant should only make claims like that after reading the relevant source in the current flow.

That means designing the system so it knows the difference between:

  • A real observation from current data
  • A guess based on prior context
  • A likely answer that has not been verified

Users forgive slowness more easily than false certainty.
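The three-way distinction above can be carried as an explicit grounding label on every claim, so unverified statements are hedged automatically. A sketch under assumed labels:

```python
from enum import Enum

class Grounding(Enum):
    OBSERVED = "observed"      # read from live data in the current flow
    REMEMBERED = "remembered"  # prior context, possibly stale
    GUESSED = "guessed"        # plausible but never verified

def render_claim(text: str, grounding: Grounding) -> str:
    """Only observed claims may be stated as fact; everything else is hedged."""
    if grounding is Grounding.OBSERVED:
        return text
    if grounding is Grounding.REMEMBERED:
        return f"As of my last check, {text}"
    return f"I have not verified this, but likely: {text}"
```

The key design choice is that the hedge is applied by the system, not left to the model's phrasing, so "your inbox is clear" can only appear as a bare statement when the inbox was actually read in the current turn.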

Give the assistant safe defaults

Restraint also comes from defaults.

A safe assistant should default to:

  • Drafting instead of sending
  • Asking for approval before mutating state
  • Naming uncertainty when the evidence is incomplete
  • Suggesting next steps when blocked
  • Choosing smaller actions before bigger ones

This matters because most operational mistakes do not come from malicious behavior. They come from systems taking the biggest available action too early.

Safe defaults reduce blast radius.
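Those defaults can live in configuration rather than prompts, with the fallback itself being the most conservative option. The situation keys below are illustrative assumptions:

```python
# Hypothetical policy table: each situation maps to its default behavior.
SAFE_DEFAULTS = {
    "send_messages": "draft",
    "mutate_state": "ask_approval",
    "uncertain_evidence": "name_uncertainty",
    "blocked": "suggest_next_steps",
    "action_scope": "smallest_first",
}

def resolve_behavior(situation: str) -> str:
    """Anything not explicitly listed falls back to asking for approval."""
    return SAFE_DEFAULTS.get(situation, "ask_approval")
```

Keeping the policy in data rather than in the system prompt also makes it auditable: you can review and test the table without re-reading prose instructions.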

Design for reversibility

An assistant will eventually make mistakes. When it does, reversibility matters as much as correctness.

Before allowing an action, ask:

  • Can this be previewed?
  • Can this be undone?
  • Can this be logged?
  • Can this be scoped?
  • Can this be retried safely?

If the answer is no across the board, that action probably needs a stronger approval boundary.

This is one of the hidden differences between a demo-friendly assistant and a production-ready one.
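The checklist above can be run as code before an action is permitted. A sketch, assuming the safeguards are tracked as boolean properties on the action:

```python
def approval_boundary_needed(action: dict) -> bool:
    """If an action can be neither previewed, undone, logged,
    scoped, nor retried safely, require a stronger approval boundary."""
    safeguards = ("previewable", "undoable", "logged", "scoped", "retryable")
    return not any(action.get(s, False) for s in safeguards)
```

An action with no safeguards at all (or one whose properties were never filled in) automatically demands the strongest boundary, which is the conservative reading of "no across the board."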

The goal is not passivity. It is earned autonomy.

None of this means the assistant should become timid or useless.

A great assistant should absolutely take initiative. It should gather context, identify options, prepare work, surface risk, and reduce cognitive load.

But it should earn the right to act.

The path usually looks like this:

  1. Read and understand the situation
  2. State what it found
  3. Show what it proposes to do
  4. Wait when risk demands it
  5. Execute when approval is clear
  6. Confirm exactly what changed

That sequence creates a very different feeling for the user.

Instead of "I hope this thing does not break something," the experience becomes "this system is careful, legible, and under control."

That feeling is part of the product.

What future clients actually want

Most serious buyers are not asking for maximum autonomy.

They are asking for useful autonomy inside safe boundaries.

They want assistants that:

  • Move quickly on low-risk work
  • Slow down on high-risk work
  • Tell the truth about what they know
  • Show their work before acting
  • Escalate cleanly when ambiguity appears

That is what trust in AI looks like in practice.

Not a magical assistant that does everything.

A disciplined assistant that knows when not to act.

Final thought

The easiest way to ruin trust in AI is to make the system look more certain than it is.

The fastest way to build trust is to make judgment visible.

If your assistant can explain what it knows, what it plans to do, and why it is waiting, users will give it room to help.

And once that trust is in place, useful autonomy becomes possible.

The next step

Before you add another capability to an assistant, ask a harder question: what should this system never do without showing its work first?

If your team cannot answer that clearly, do not add more autonomy yet. Add better judgment boundaries first, because every unclear boundary becomes a future trust problem.