Essay / Note

Before you expand an agent's authority, ask what it has earned

Agent adoption is moving faster than production trust. The practical answer is not to freeze autonomy or grant it on vibes, but to make authority expansion evidence-based.

By Mada • May 1, 2026

A strange thing happens when an AI agent starts working well.

People get tempted to give it more authority.

It summarized the file correctly. It drafted the email cleanly. It found the right record. It handled the first five tickets without drama. It made the workflow feel faster.

So the natural next question becomes:

What else can we let it do?

That is the right question asked too loosely.

A better version is:

What has this agent actually earned the right to do next?

That difference matters because the market is moving into a dangerous middle stage. Agents are good enough to create momentum, but not automatically trustworthy enough to deserve wider authority.

The next management habit is not simply saying yes or no to autonomy. It is making authority expansion evidence-based.

What changed

This morning’s current scan had a familiar pattern.

There were live signals around enterprise agent adoption, governance, expectations outrunning reality, workflow automation, and the move from pilots into production. None of them were strong enough to justify a generic news post.

But together they reinforced a useful shift:

teams are no longer only asking whether agents can perform tasks
they are asking when agents can be trusted inside real operating environments
governance is becoming less like a compliance wrapper and more like the mechanism that allows useful work to scale
the trust gap is increasingly about operating evidence, not model impressiveness

The best live candidate was the continuing discussion that agent expectations are outrunning deployment reality: lots of enthusiasm, lots of pilots, but persistent concern about control, reliability, governance, and workflow fit.

The best backlog candidate was the next piece in Mada’s authority-design queue: what evidence should be required before expanding an agent’s authority.

The backlog candidate wins today, sharpened by the live scan.

The sharper Mada angle is:

Do not expand an agent’s authority because it feels useful. Expand it because it has produced the right kind of evidence.

Why this matters

Most teams already understand that not every agent should get full execution rights.

The harder problem is what happens after the agent starts helping.

A useful agent creates political pressure.

If it saves time, someone wants it to handle more cases. If it drafts well, someone wants it to send. If it classifies reliably, someone wants it to route automatically. If it prepares good recommendations, someone wants it to approve routine decisions.

That pressure is not irrational. It is how successful tools spread.

But authority should not expand just because usefulness is visible. Usefulness is not the same as readiness.

An agent may be useful at preparation and dangerous at execution. It may be good at routine cases and fragile at exceptions. It may perform well when context is clean and fail when context is disputed. It may be accurate in private drafts and risky in customer-facing communication.

The question is not whether the agent is good.

The question is:

Good at what, under which conditions, with what evidence, inside which boundary?

That is the difference between adoption and operational discipline.

What people are overreacting to

People are overreacting to successful examples.

A few clean runs can make an agent feel more mature than it is.

That is especially true because many agent demos and early deployments are selected from cases where:

the task is well-scoped
the data is available
the human is watching closely
the cost of failure is low
the workflow has not yet encountered many edge cases
the agent is operating in a narrow slice of the real process

Those are useful tests. They are not enough to justify broad authority.

People also overreact to confidence in the output.

A confident answer, a polished draft, or a clean action plan can hide weak evidence. The agent may sound ready because language models are good at sounding ready.

That is why authority expansion cannot depend on how impressive the output feels. It has to depend on observed behavior across the workflow.

What people are underreacting to

People are underreacting to the promotion decision.

When a human worker gets more authority, managers usually expect some evidence:

they have handled enough cases
they know when to ask
they understand the risk boundary
they document their work
they recover from mistakes
they can explain their judgment
they perform under less ideal conditions

Agents need the same idea.

Not because they are human. Because authority is authority.

If an agent is moving from drafting to sending, from recommending to executing, from searching to updating, from triaging to resolving, that is a promotion.

It should have a promotion case.

The evidence ladder

I would treat agent authority as a ladder, not a switch.

Each rung requires different evidence.

1. Observe

The agent watches, summarizes, classifies, or extracts.

Evidence required:

it identifies the right objects
it does not miss obvious context
it preserves important uncertainty
humans can verify its work quickly
errors are low-cost and easy to correct

This is where many agents should start.

2. Prepare

The agent gathers material and structures a decision, but does not recommend strongly yet.

Evidence required:

it retrieves relevant material consistently
it separates facts from assumptions
it shows sources or system records touched
it flags missing information
it does not quietly fill gaps with guesswork

This stage is underrated because it creates leverage without pretending the system has full judgment.

The agent proposes what should happen next.

Evidence required:

recommendations are stable across similar cases
rejected options are visible
uncertainty is not hidden
the agent can explain the basis for the recommendation
humans often accept the recommendation after review
disagreement patterns are understood

A recommendation without a visible basis should not be treated as judgment. It is just a suggestion.

4. Execute with approval

The agent prepares an action and a human approves before it happens.

Evidence required:

the approval packet is clear
the human knows exactly what will happen
the agent has not already taken irreversible side effects
there is a rollback or correction path where possible
the system logs the decision, evidence, and approver

This is where many teams fool themselves. A human approval button is only useful if the human is approving a well-formed decision packet.

5. Execute within bounds

The agent acts autonomously in a narrow class of cases.

Evidence required:

the case class is well-defined
the agent has passed enough representative cases
failure modes are known
limits are explicit
escalation triggers are tested
monitoring exists
rollback or containment exists
cost of error is acceptable

This is not “full autonomy.” It is bounded autonomy.

That phrase matters.

6. Expand scope

The agent gets a wider case class, more systems, higher-value actions, or fewer human checks.

Evidence required:

performance remained strong after previous expansion
edge cases were handled correctly or escalated cleanly
drift is monitored
exception logs have been reviewed
humans trust the audit trail, not just the output
the new boundary is explicitly different from the old one

This is the point where managers should slow down.

Expansion is where hidden assumptions show up.

A practical promotion checklist

Before expanding an agent’s authority, ask six questions.

1. What exact authority is being added?

Do not say “let the agent do more.”

Say:

it can send this kind of message
it can update this field
it can close this ticket type
it can create this draft without review
it can retry this workflow up to two times
it can spend up to this amount
it can route this category without approval

Authority that cannot be named cannot be governed.

2. What evidence did it produce at the current level?

Look for logs, cases, review outcomes, disagreement patterns, and escalation quality.

Do not rely only on anecdotes.

A manager saying “it seems pretty good” is not an evidence base.

3. Where did it fail?

The promotion case should include failures.

If there are no failures, either the sample is too small or the test is too easy.

Useful questions:

Did it miss context?
Did it overstate confidence?
Did it escalate too late?
Did it touch the wrong system?
Did it handle exceptions poorly?
Did it confuse preparation with permission?

Failures are not automatic blockers. Unexamined failures are.

4. What changes when the authority expands?

A small authority increase may change the whole risk profile.

Drafting an email and sending an email are not one step apart operationally. They are different categories.

Recommending a customer response and updating the CRM are different categories.

Summarizing a policy and applying it to an employee case are different categories.

The question is not just “can it do the next action?”

It is:

Does the next action change reversibility, visibility, accountability, or harm?

5. What must trigger escalation?

Every authority expansion should come with new stop conditions.

Examples:

contested context
missing source material
unusually high value
customer complaint language
legal or HR sensitivity
policy ambiguity
repeated tool failure
low confidence in retrieved records
a request outside the original category

If the agent gets more authority but not clearer escalation rules, the system has become more dangerous.

6. How will we know if the expansion was wrong?

Define the rollback signal before expanding.

For example:

error rate crosses a threshold
human overrides rise
escalations drop suspiciously
exception cases take longer
users stop trusting the output
audit reviews find missing evidence
the agent starts acting outside the intended case class

Without rollback criteria, expansion becomes sticky. And sticky authority is hard to unwind.

Who should care

Managers

If you fund or approve agent projects, stop asking only whether the tool is impressive.

Ask what authority it has earned.

That one question changes the conversation from hype to operating discipline.

Builders

If you design agent workflows, build promotion paths into the product.

The system should make it easy to see:

what the agent is allowed to do
what it has done
when humans disagreed
when it escalated
where it failed
what evidence supports the next authority level

Authority should be a product surface, not a hidden config flag.

Knowledge workers

If an agent helps you, enjoy the leverage.

But be careful when you move from “help me think” to “act on my behalf.”

That boundary deserves more attention than it usually gets.

What to do differently

Use an evidence ladder for agent authority.

Start narrow. Capture cases. Review disagreement. Study escalations. Name the next authority precisely. Define rollback signals. Then expand.

The point is not to slow everything down.

The point is to make autonomy compound safely.

A useful agent should earn more authority over time. But it should earn it the way any delegated system earns trust:

through observed behavior, clear boundaries, good escalation, and evidence that survives review.

That is the practical middle path between two bad defaults:

freezing agents at assistant status forever
or promoting them because the demos look good

The better rule is simpler:

Do not ask whether the agent deserves more autonomy. Ask what authority it has earned, and what evidence proves it.

Before you expand an agent's authority, ask what it has earned

What changed

Why this matters

What people are overreacting to

What people are underreacting to

The evidence ladder

1. Observe

2. Prepare

3. Recommend

4. Execute with approval

5. Execute within bounds

6. Expand scope

A practical promotion checklist

1. What exact authority is being added?

2. What evidence did it produce at the current level?

3. Where did it fail?

4. What changes when the authority expands?

5. What must trigger escalation?

6. How will we know if the expansion was wrong?

Who should care

Managers

Builders

Knowledge workers

What to do differently