Essay / Note
Before you expand an agent's authority, ask what it has earned
Agent adoption is moving faster than production trust. The practical answer is not to freeze autonomy or grant it on vibes, but to make authority expansion evidence-based.
A strange thing happens when an AI agent starts working well.
People get tempted to give it more authority.
It summarized the file correctly. It drafted the email cleanly. It found the right record. It handled the first five tickets without drama. It made the workflow feel faster.
So the natural next question becomes:
What else can we let it do?
That is the right question, asked too loosely.
A better version is:
What has this agent actually earned the right to do next?
That difference matters because the market is moving into a dangerous middle stage. Agents are good enough to create momentum, but not automatically trustworthy enough to deserve wider authority.
The next management habit is not simply saying yes or no to autonomy. It is making authority expansion evidence-based.
What changed
This morning’s scan of current signals showed a familiar pattern.
There were live signals around enterprise agent adoption, governance, expectations outrunning reality, workflow automation, and the move from pilots into production. None of them were strong enough to justify a generic news post.
But together they reinforced a useful shift:
- teams are no longer only asking whether agents can perform tasks
- they are asking when agents can be trusted inside real operating environments
- governance is becoming less like a compliance wrapper and more like the mechanism that allows useful work to scale
- the trust gap is increasingly about operating evidence, not model impressiveness
The best live candidate was the continuing discussion about agent expectations outrunning deployment reality: lots of enthusiasm, lots of pilots, but persistent concern about control, reliability, governance, and workflow fit.
The best backlog candidate was the next piece in Mada’s authority-design queue: what evidence should be required before expanding an agent’s authority.
The backlog candidate wins today, sharpened by the live scan.
The sharper Mada angle is:
Do not expand an agent’s authority because it feels useful. Expand it because it has produced the right kind of evidence.
Why this matters
Most teams already understand that not every agent should get full execution rights.
The harder problem is what happens after the agent starts helping.
A useful agent creates political pressure.
If it saves time, someone wants it to handle more cases. If it drafts well, someone wants it to send. If it classifies reliably, someone wants it to route automatically. If it prepares good recommendations, someone wants it to approve routine decisions.
That pressure is not irrational. It is how successful tools spread.
But authority should not expand just because usefulness is visible. Usefulness is not the same as readiness.
An agent may be useful at preparation and dangerous at execution. It may be good at routine cases and fragile at exceptions. It may perform well when context is clean and fail when context is disputed. It may be accurate in private drafts and risky in customer-facing communication.
The question is not whether the agent is good.
The question is:
Good at what, under which conditions, with what evidence, inside which boundary?
That is the difference between adoption and operational discipline.
What people are overreacting to
People are overreacting to successful examples.
A few clean runs can make an agent feel more mature than it is.
That is especially true because many agent demos and early deployments are selected from cases where:
- the task is well-scoped
- the data is available
- the human is watching closely
- the cost of failure is low
- the workflow has not yet encountered many edge cases
- the agent is operating in a narrow slice of the real process
Those are useful tests. They are not enough to justify broad authority.
People also overreact to confidence in the output.
A confident answer, a polished draft, or a clean action plan can hide weak evidence. The agent may sound ready because language models are good at sounding ready.
That is why authority expansion cannot depend on how impressive the output feels. It has to depend on observed behavior across the workflow.
What people are underreacting to
People are underreacting to the promotion decision.
When a human worker gets more authority, managers usually expect some evidence:
- they have handled enough cases
- they know when to ask
- they understand the risk boundary
- they document their work
- they recover from mistakes
- they can explain their judgment
- they perform under less ideal conditions
Agents need the same idea.
Not because they are human. Because authority is authority.
If an agent is moving from drafting to sending, from recommending to executing, from searching to updating, from triaging to resolving, that is a promotion.
It should have a promotion case.
The evidence ladder
I would treat agent authority as a ladder, not a switch.
Each rung requires different evidence.
1. Observe
The agent watches, summarizes, classifies, or extracts.
Evidence required:
- it identifies the right objects
- it does not miss obvious context
- it preserves important uncertainty
- humans can verify its work quickly
- errors are low-cost and easy to correct
This is where many agents should start.
2. Prepare
The agent gathers material and structures a decision, but does not yet recommend an outcome.
Evidence required:
- it retrieves relevant material consistently
- it separates facts from assumptions
- it shows sources or system records touched
- it flags missing information
- it does not quietly fill gaps with guesswork
This stage is underrated because it creates leverage without pretending the system has full judgment.
3. Recommend
The agent proposes what should happen next.
Evidence required:
- recommendations are stable across similar cases
- rejected options are visible
- uncertainty is not hidden
- the agent can explain the basis for the recommendation
- humans accept the recommendation after review at a consistently high rate
- disagreement patterns are understood
A recommendation without a visible basis should not be treated as judgment. It is just a suggestion.
4. Execute with approval
The agent prepares an action and a human approves before it happens.
Evidence required:
- the approval packet is clear
- the human knows exactly what will happen
- the agent has not already caused irreversible side effects
- there is a rollback or correction path where possible
- the system logs the decision, evidence, and approver
This is where many teams fool themselves. A human approval button is only useful if the human is approving a well-formed decision packet.
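A well-formed decision packet can be checked mechanically before it ever reaches the approver. The following is a minimal Python sketch of that idea, not a reference to any existing tool; the names `ApprovalPacket` and `packet_problems`, and every field, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalPacket:
    """What a human sees before approving an agent's proposed action (hypothetical shape)."""
    action: str                                          # exactly what will happen, in plain language
    evidence: list[str] = field(default_factory=list)    # sources or records relied on
    side_effects_already_taken: list[str] = field(default_factory=list)
    rollback_plan: str = ""                              # empty if no correction path is defined
    reversible: bool = True


def packet_problems(packet: ApprovalPacket) -> list[str]:
    """Return the reasons this packet is not ready for a human to approve."""
    problems = []
    if not packet.action.strip():
        problems.append("action is not described")
    if not packet.evidence:
        problems.append("no evidence attached")
    if packet.side_effects_already_taken:
        problems.append("agent acted before approval")
    if packet.reversible and not packet.rollback_plan.strip():
        problems.append("reversible action has no rollback plan")
    return problems
```

An empty list means the packet is at least structurally complete. It does not mean the action is a good idea; that judgment still belongs to the approver.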
5. Execute within bounds
The agent acts autonomously in a narrow class of cases.
Evidence required:
- the case class is well-defined
- the agent has passed enough representative cases
- failure modes are known
- limits are explicit
- escalation triggers are tested
- monitoring exists
- rollback or containment exists
- cost of error is acceptable
This is not “full autonomy.” It is bounded autonomy.
That phrase matters.
6. Expand scope
The agent gets a wider case class, more systems, higher-value actions, or fewer human checks.
Evidence required:
- performance remained strong after previous expansion
- edge cases were handled correctly or escalated cleanly
- drift is monitored
- exception logs have been reviewed
- humans trust the audit trail, not just the output
- the new boundary is explicitly different from the old one
This is the point where managers should slow down.
Expansion is where hidden assumptions show up.
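The ladder can be made concrete by modeling the rungs as an ordered type and refusing any promotion that skips a rung or arrives with unmet evidence. This is a minimal Python sketch under two assumptions: that the evidence listed under a rung must be in hand before the agent is granted that rung, and that the evidence can be reduced to named labels. `Rung`, `REQUIRED_EVIDENCE`, `can_promote`, and the abbreviated labels are all hypothetical.

```python
from enum import IntEnum


class Rung(IntEnum):
    """The six rungs of the ladder, in order of increasing authority."""
    OBSERVE = 1
    PREPARE = 2
    RECOMMEND = 3
    EXECUTE_WITH_APPROVAL = 4
    EXECUTE_WITHIN_BOUNDS = 5
    EXPAND_SCOPE = 6


# Hypothetical, abbreviated labels; a real list would be longer and
# specific to the workflow.
REQUIRED_EVIDENCE = {
    Rung.OBSERVE: {"identifies_right_objects", "preserves_uncertainty"},
    Rung.PREPARE: {"separates_facts_from_assumptions", "flags_missing_information"},
    Rung.RECOMMEND: {"stable_across_similar_cases", "basis_explained"},
    Rung.EXECUTE_WITH_APPROVAL: {"clear_approval_packet", "decision_logged"},
    Rung.EXECUTE_WITHIN_BOUNDS: {"escalation_triggers_tested", "rollback_exists"},
    Rung.EXPAND_SCOPE: {"exception_logs_reviewed", "drift_monitored"},
}


def can_promote(current: Rung, target: Rung, evidence: set[str]) -> tuple[bool, list[str]]:
    """Allow a promotion only one rung at a time, and only with full evidence.

    Returns (allowed, reasons); reasons lists what blocks the promotion.
    """
    if target != current + 1:
        return False, [f"promotion must be one rung at a time, not {current.name} -> {target.name}"]
    missing = sorted(REQUIRED_EVIDENCE[target] - evidence)
    return (not missing), missing
```

For example, `can_promote(Rung.PREPARE, Rung.RECOMMEND, {"stable_across_similar_cases"})` is refused and reports `basis_explained` as the missing item, which is the kind of answer a promotion case should be able to produce.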
A practical promotion checklist
Before expanding an agent’s authority, ask six questions.
1. What exact authority is being added?
Do not say “let the agent do more.”
Say:
- it can send this kind of message
- it can update this field
- it can close this ticket type
- it can create this draft without review
- it can retry this workflow up to two times
- it can spend up to this amount
- it can route this category without approval
Authority that cannot be named cannot be governed.
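Naming authority this precisely lends itself to an explicit data structure. As a rough Python sketch, each permission could be a small record with a verb, a target, and an optional limit, with anything not listed denied by default. `AuthorityGrant`, `is_permitted`, and the example values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthorityGrant:
    """One narrowly named permission (hypothetical shape)."""
    verb: str                       # e.g. "send", "update", "close", "retry", "spend"
    target: str                     # e.g. a message type, a field, or a ticket type
    limit: Optional[float] = None   # e.g. max retries or max spend; None if not applicable
    requires_approval: bool = True


def is_permitted(grants, verb, target, amount=0.0):
    """Permit an action only if some grant names it and the amount fits its limit."""
    for grant in grants:
        if grant.verb == verb and grant.target == target:
            if grant.limit is None or amount <= grant.limit:
                return True
    return False  # anything not explicitly granted is denied


# Usage: two named grants, everything else refused.
grants = [
    AuthorityGrant("retry", "invoice_sync", limit=2),
    AuthorityGrant("close", "password_reset_ticket", requires_approval=False),
]
```

With these grants, a third retry of `invoice_sync` is refused, and so is closing any ticket type other than the one named.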
2. What evidence did it produce at the current level?
Look for logs, cases, review outcomes, disagreement patterns, and escalation quality.
Do not rely only on anecdotes.
A manager saying “it seems pretty good” is not an evidence base.
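An evidence base can start as a simple summary over reviewed cases. The sketch below, in Python, assumes each reviewed case records whether the reviewer accepted the output, whether the agent escalated, and whether the case was an exception. The `ReviewedCase` fields, the `summarize` function, and the default of 50 cases are illustrative assumptions, not recommended thresholds.

```python
from dataclasses import dataclass


@dataclass
class ReviewedCase:
    accepted: bool       # reviewer accepted the agent's output unchanged
    escalated: bool      # agent handed the case to a human
    was_exception: bool  # case fell outside the routine pattern


def summarize(cases, min_cases=50):
    """Summarize review outcomes at the agent's current authority level."""
    n = len(cases)
    if n == 0:
        return {"cases": 0, "sufficient_sample": False}
    return {
        "cases": n,
        "sufficient_sample": n >= min_cases,  # min_cases is an arbitrary placeholder
        "acceptance_rate": sum(c.accepted for c in cases) / n,
        "escalation_rate": sum(c.escalated for c in cases) / n,
        "exceptions_seen": sum(c.was_exception for c in cases),
    }
```

A summary with a high acceptance rate but zero exceptions seen says more about the sample than about the agent.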
3. Where did it fail?
The promotion case should include failures.
If there are no failures, either the sample is too small or the test is too easy.
Useful questions:
- Did it miss context?
- Did it overstate confidence?
- Did it escalate too late?
- Did it touch the wrong system?
- Did it handle exceptions poorly?
- Did it confuse preparation with permission?
Failures are not automatic blockers. Unexamined failures are.
4. What changes when the authority expands?
A small authority increase may change the whole risk profile.
Drafting an email and sending an email are not one step apart operationally. They are different categories.
Recommending a customer response and updating the CRM are different categories.
Summarizing a policy and applying it to an employee case are different categories.
The question is not just “can it do the next action?”
It is:
Does the next action change reversibility, visibility, accountability, or harm?
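That question can be asked systematically by describing the current and proposed actions along the same four dimensions and listing what differs. The following Python sketch is one way to do it; `RiskProfile`, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RiskProfile:
    """An action described along the four dimensions (hypothetical encoding)."""
    reversible: bool
    externally_visible: bool
    accountable_party: str
    potential_harm: str  # e.g. "low", "medium", "high"


def changed_dimensions(before: RiskProfile, after: RiskProfile) -> list[str]:
    """List the dimensions on which the proposed action differs from the current one."""
    return [
        f.name
        for f in fields(RiskProfile)
        if getattr(before, f.name) != getattr(after, f.name)
    ]


# Example values are assumptions chosen to illustrate drafting versus sending.
draft_email = RiskProfile(True, False, "account owner", "low")
send_email = RiskProfile(False, True, "account owner", "medium")
```

Here `changed_dimensions(draft_email, send_email)` reports three of the four dimensions, which is why the two actions belong in different categories even though they look adjacent.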
5. What must trigger escalation?
Every authority expansion should come with new stop conditions.
Examples:
- contested context
- missing source material
- unusually high value
- customer complaint language
- legal or HR sensitivity
- policy ambiguity
- repeated tool failure
- low confidence in retrieved records
- a request outside the original category
If the agent gets more authority but not clearer escalation rules, the system has become more dangerous.
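Stop conditions are easier to test when they are written as explicit checks that run before the agent acts. This Python sketch covers a handful of the examples above. The case is assumed to be a plain dictionary, and the key names, the `escalation_reasons` function, and the default thresholds are all illustrative assumptions.

```python
def escalation_reasons(case, allowed_categories, value_limit=1000.0, max_tool_failures=2):
    """Return every stop condition the case triggers; an empty list means proceed."""
    reasons = []
    if case.get("context_contested"):
        reasons.append("contested context")
    if not case.get("sources"):
        reasons.append("missing source material")
    if case.get("value", 0) > value_limit:
        reasons.append("unusually high value")
    if case.get("tool_failures", 0) >= max_tool_failures:
        reasons.append("repeated tool failure")
    if case.get("category") not in allowed_categories:
        reasons.append("request outside the original category")
    return reasons
```

Returning all triggered reasons, rather than stopping at the first, gives the human who receives the escalation a fuller picture of why the agent stopped.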
6. How will we know if the expansion was wrong?
Define the rollback signal before expanding.
For example:
- error rate crosses a threshold
- human overrides rise
- escalations drop suspiciously
- exception cases take longer
- users stop trusting the output
- audit reviews find missing evidence
- the agent starts acting outside the intended case class
Without rollback criteria, expansion becomes sticky. And sticky authority is hard to unwind.
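Rollback criteria are most useful when they are written down as comparisons against a baseline captured before the expansion. The Python sketch below checks three of the signals listed above. The metric names, the `rollback_signals` function, and the default thresholds are placeholders chosen for illustration, not recommended values.

```python
def rollback_signals(baseline, current, max_error_rate=0.05,
                     override_rise=1.5, escalation_drop=0.5):
    """Compare post-expansion metrics to the pre-expansion baseline.

    Both arguments are dicts with "error_rate", "override_rate", and
    "escalation_rate" expressed as fractions between 0 and 1.
    """
    signals = []
    if current["error_rate"] > max_error_rate:
        signals.append("error rate crossed threshold")
    if current["override_rate"] > baseline["override_rate"] * override_rise:
        signals.append("human overrides rose")
    # A sharp fall in escalations can mean the agent stopped asking, not that cases got easier.
    if current["escalation_rate"] < baseline["escalation_rate"] * escalation_drop:
        signals.append("escalations dropped suspiciously")
    return signals
```

Any non-empty result is a prompt to review the expansion, and agreeing on these thresholds in advance is what keeps the decision from becoming a negotiation after the fact.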
Who should care
Managers
If you fund or approve agent projects, stop asking only whether the tool is impressive.
Ask what authority it has earned.
That one question changes the conversation from hype to operating discipline.
Builders
If you design agent workflows, build promotion paths into the product.
The system should make it easy to see:
- what the agent is allowed to do
- what it has done
- when humans disagreed
- when it escalated
- where it failed
- what evidence supports the next authority level
Authority should be a product surface, not a hidden config flag.
Knowledge workers
If an agent helps you, enjoy the leverage.
But be careful when you move from “help me think” to “act on my behalf.”
That boundary deserves more attention than it usually gets.
What to do differently
Use an evidence ladder for agent authority.
Start narrow. Capture cases. Review disagreement. Study escalations. Name the next authority precisely. Define rollback signals. Then expand.
The point is not to slow everything down.
The point is to make autonomy compound safely.
A useful agent should earn more authority over time. But it should earn it the way any delegated system earns trust:
through observed behavior, clear boundaries, good escalation, and evidence that survives review.
That is the practical middle path between two bad defaults:
- freezing agents at assistant status forever
- promoting them because the demos look good
The better rule is simpler:
Do not ask whether the agent deserves more autonomy. Ask what authority it has earned, and what evidence proves it.