How I Design Trust Boundaries for AI Systems
One of the fastest ways to create AI chaos is to skip the trust-boundary discussion.
A team gets excited about a workflow, wires in a model, and only later asks the important questions:
- what is the system allowed to see?
- what is it allowed to do?
- what should always require approval?
- what happens when the output is weak?
By that point, the cleanup is usually more painful than the original design work would have been.
The Simple Model I Use
I like to sort AI behavior into four levels:
- observe
- prepare
- recommend
- act
This framing is useful because it forces precision: each level names exactly how much the system is trusted to do.
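As a sketch, the ladder can be encoded so that a capability check becomes a single comparison. The names `TrustLevel` and `is_allowed` are illustrative, not from any particular framework:

```python
from enum import IntEnum

class TrustLevel(IntEnum):
    OBSERVE = 1    # read context only
    PREPARE = 2    # summarize, classify, draft
    RECOMMEND = 3  # propose what should happen next
    ACT = 4        # trigger real side effects

def is_allowed(granted: TrustLevel, required: TrustLevel) -> bool:
    """A capability is permitted only if the system's granted level
    is at or above the level that capability requires."""
    return granted >= required

# A system granted PREPARE may read and draft, but never act.
assert is_allowed(TrustLevel.PREPARE, TrustLevel.OBSERVE)
assert not is_allowed(TrustLevel.PREPARE, TrustLevel.ACT)
```

Making the grant explicit like this also means the boundary shows up in code review, not just in a design doc.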
1. Observe
At this level, the system can read context.
Examples:
- documents
- tickets
- CRM state
- product data
- transcripts
- internal notes
This sounds low-risk, but even here the team still needs to decide:
- what data is in scope?
- what permissions apply?
- what should never be exposed?
- how fresh does the context need to be?
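One minimal way to enforce those decisions is a scoping filter that runs before any context reaches the model. Everything here is an assumption for illustration: the source names, the PII flag, and the 30-day freshness window would all come from the team's own policy.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical scope policy: which sources are readable, and how
# stale a record may be before it drops out of context.
ALLOWED_SOURCES = {"tickets", "help_docs", "crm"}
MAX_AGE = timedelta(days=30)

def in_scope(record: dict, now: datetime) -> bool:
    """Return True only if this record may be shown to the model."""
    if record["source"] not in ALLOWED_SOURCES:
        return False
    if record.get("contains_pii"):   # never expose flagged records
        return False
    return now - record["updated_at"] <= MAX_AGE
```

The point is less the specific rules than that scoping is a deliberate, testable function rather than an accident of what happened to be in the database.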
2. Prepare
This is where a lot of the best first workflows live.
The system can:
- summarize
- classify
- structure information
- assemble context
- draft a first pass
This is powerful because it improves the starting point without pretending the system should own the final decision.
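A prepare step can be sketched as a pipeline that assembles a starting point without deciding anything. The helper callables (`summarize`, `classify`, `draft`) are stand-ins for whatever model calls a team actually uses:

```python
def prepare(ticket_text: str, summarize, classify, draft) -> dict:
    """Assemble a first pass for a human reviewer. The model helpers
    are injected as plain callables; nothing here is final."""
    summary = summarize(ticket_text)
    category = classify(ticket_text)
    return {
        "summary": summary,
        "category": category,
        "draft_reply": draft(summary, category),  # first pass only
    }
```

The output is a better starting point, and the human still owns everything downstream of it.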
3. Recommend
At this level, the system starts proposing what should happen next.
Examples:
- suggest the next support action
- rank likely classifications
- recommend a product action
- propose a response or plan
This is where review design becomes more important.
The team needs to decide:
- who checks the recommendation?
- what confidence cues should exist?
- what low-confidence behavior should trigger escalation?
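Those review decisions can be made concrete with a small routing function. The 0.5 and 0.8 thresholds below are placeholders a team would tune, not recommendations:

```python
def route_recommendation(suggestion: str, confidence: float) -> dict:
    """Attach a confidence cue and escalate low-confidence output.
    Thresholds are illustrative; a real system would calibrate them."""
    if confidence < 0.5:
        return {"action": "escalate", "reason": "low confidence"}
    cue = "high" if confidence >= 0.8 else "medium"
    return {"action": "review", "suggestion": suggestion, "cue": cue}
```

Even this crude version answers the three questions above: a human checks every recommendation, the cue is visible, and weak output goes to escalation instead of a reviewer's blind spot.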
4. Act
This is the level teams often want to jump to too early.
Here the system can actually trigger something:
- send a message
- update a record
- queue a task
- publish something
- call an external API with real consequences
Sometimes this is appropriate. But once the system can act, the design needs to become much stricter.
Questions that matter:
- which actions always require approval?
- what gets logged?
- what permissions apply?
- what rollback or correction path exists?
- what should happen when the system is uncertain?
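A minimal sketch of such a gate, assuming a hypothetical approval list and plain callables for the action and its rollback:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("action-gate")

# Hypothetical policy: actions that always need human sign-off.
REQUIRES_APPROVAL = {"send_message", "close_ticket", "publish"}

def execute(action: str, approved: bool, perform, undo=None) -> dict:
    """Run an action only if policy allows it, and log every attempt.
    `perform` does the work; `undo` is the correction path."""
    if action in REQUIRES_APPROVAL and not approved:
        log.info("blocked %s: approval required", action)
        return {"status": "pending_approval", "action": action}
    log.info("executing %s", action)
    result = perform()
    # Keep the rollback path with the result so a bad action
    # can be corrected, not just regretted.
    return {"status": "done", "result": result, "undo": undo}
```

Notice that three of the five questions (approval, logging, rollback) become fields and branches here; permissions and uncertainty handling would wrap around this in the same explicit style.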
Why Most First Versions Should Stay Lower on the Ladder
A lot of value shows up before action.
A system that prepares and recommends well can already:
- save time
- improve consistency
- reduce context-switching
- give humans a stronger first pass
That is often enough to create real leverage.
Moving to action too early usually creates trust problems faster than it creates value.
Trust Boundaries Are Not Just About Safety
They are also about usability.
A well-designed boundary makes the workflow easier to understand.
People know:
- what the system does
- what it does not do
- where human judgment still matters
- when they should trust the output and when they should slow down
That clarity is part of good product design.
A Practical Example
Imagine an internal support system.
A weak boundary sounds like:
"The AI handles support."
A better boundary sounds like:
"The system can read the ticket, account history, and help docs. It can draft a reply and suggest a next action. It cannot send the reply or close the ticket without human approval."
That is much easier to build, review, and trust.
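One way to keep a boundary like that honest is to write it down as data. This schema is invented for illustration, not a real configuration format:

```python
# Illustrative policy object for the support example. The field
# names are assumptions, not a standard schema.
SUPPORT_BOUNDARY = {
    "observe": ["ticket", "account_history", "help_docs"],
    "prepare": ["draft_reply"],
    "recommend": ["suggest_next_action"],
    "act": [],  # nothing executes without a human
    "requires_approval": ["send_reply", "close_ticket"],
}
```

A reviewer can read this in ten seconds and know exactly what the system does, which is the usability point from the previous section.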
Final Thought
Trust boundaries should be designed before implementation, not discovered after a bad launch.
If the team can say clearly what the system can observe, prepare, recommend, and act on, the project usually gets much stronger.
That clarity is one of the main things that separates a useful AI system from a messy one.