
The missing layer in AI systems is authority design

A lot of current AI discussion focuses on capability, autonomy, and human-in-the-loop slogans. The more practical question is who can authorize what, under which conditions, and where approval boundaries actually belong.

By Mada • Apr 16, 2026

A lot of current AI discussion is finally getting more serious.

People are asking less often:

Can the model do the task?

and more often:

Can we trust the system to operate inside a real workflow?

That is progress.

But I still think a lot of the conversation is missing the most practical layer.

The real question is not just whether the system is smart. It is not even just whether a human stays “in the loop.”

It is this:

Who has authority to let the system do what, under which conditions, and with what checkpoint before the risk becomes real?

That is an authority-design problem.

And I think people are still underreacting to it.

What changed

A few different signals are converging right now:

more agent systems are moving from demo territory into actual operating environments
more discussion is shifting from raw capability to governance, permissions, and oversight
more practitioners are noticing that the real failure mode is not always a bad answer; often it is an unwise action
more security and enterprise commentary is treating agents less like software features and more like workers with delegated access

That is the useful shift.

Because once a system can:

call tools
access systems
trigger workflows
move data
draft messages that may actually get sent
modify configuration
or chain those things together

then the important question stops being “is the model impressive?”

The important question becomes:

Where does authority begin, where does it stop, and how is that boundary enforced?

What people are overreacting to

I think people still overreact to the autonomy story.

They ask:

how autonomous can this become?
how few humans can we keep involved?
how much end-to-end work can the agent own?

That is understandable. It is also how a lot of teams walk into a preventable mess.

Autonomy is not the same thing as good system design.

Sometimes more autonomy means:

blurrier accountability
harder debugging
hidden risk transfer
accidental permission creep
humans reviewing only after the damage is already done

A lot of “human in the loop” discussion is also too soft. It sounds responsible. But it often stays vague.

If someone says there is human oversight, I want to know:

at which exact step?
before which exact action?
with what context shown to the human?
with what ability to edit, reject, or narrow the scope?
and what happens if the human does nothing?

Without those details, “oversight” is often just a comforting phrase.
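
One way to force those details into the open is to write the checkpoint down as data. What follows is only a minimal sketch in Python, not a reference design. The names `Checkpoint`, `Decision`, and `resolve` are hypothetical, and the default of rejecting when the reviewer does nothing is my assumption, not a rule every workflow needs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"
    NARROW = "narrow"  # approve, but with a reduced scope


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical record of one human oversight point."""
    step: str                        # at which exact step
    gated_action: str                # before which exact action
    context_shown: tuple[str, ...]   # what the reviewer actually sees
    allowed: frozenset[Decision]     # what the reviewer may do
    on_timeout: Decision = Decision.REJECT  # assumption: silence is not consent


def resolve(cp: Checkpoint, response: Optional[Decision]) -> Decision:
    """Return the effective decision; None means the human did nothing."""
    if response is None:
        return cp.on_timeout
    if response not in cp.allowed:
        raise ValueError(f"{response.value} is not permitted at step {cp.step!r}")
    return response
```

If a team cannot fill in every field of a record like this for a given workflow, the oversight is probably still a phrase rather than a mechanism.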

What people are underreacting to

The underreaction is that authority design is becoming a first-class product and management problem.

Not a compliance afterthought. Not only a security team’s concern. A core design decision.

In practice, most agent failures that matter are not only intelligence failures. They are authority failures.

Examples:

the system should have been allowed to draft, but not send
it should have been allowed to recommend, but not approve
it should have been allowed to classify, but not escalate automatically
it should have been allowed to prepare the change, but not execute it
it should have been allowed to search internal data, but not cross certain boundaries without explicit approval

This is why the human-in-the-loop framing is sometimes too blunt.

The better question is not:

Is there a human somewhere?

It is:

What kind of authority is delegated, what kind is retained, and where does the handoff happen?

That is a much better operating question.

The practical model I would use

When I look at an AI workflow, I want to separate four layers.

1. Perception

What can the system see?

documents
inboxes
tickets
calendars
dashboards
internal notes
production state

A surprising number of risks begin here. A system with broad read access already has real power, even before it acts.

2. Recommendation

What can the system suggest?

summaries
classifications
next-best actions
draft replies
draft plans
ranked options

This layer is safer than execution, but it still shapes behavior. A bad recommendation engine can quietly train humans into bad habits.

3. Preparation

What can the system set up?

pre-filled forms
drafted emails
proposed tickets
staged config changes
selected tools and arguments
queued actions waiting for review

This is an underrated sweet spot. A lot of useful systems should live here longer than the market currently wants to admit.

Because preparation creates leverage without pretending trust has already been earned.

4. Execution

What can the system actually do without asking again?

send
approve
purchase
publish
modify
deploy
revoke
contact customers
write to production systems

This is where authority becomes real. And this is where the bar should rise sharply.
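
The four layers can be sketched as an ordered scale with a per-resource ceiling. This is an illustration under stated assumptions, not a definitive model: the `Layer` enum, the `GRANTS` table, its resource names, and `is_allowed` are all hypothetical, and the code assumes that holding a higher layer implies holding the lower ones.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four layers, ordered from least to most authority."""
    PERCEPTION = 1      # see
    RECOMMENDATION = 2  # suggest
    PREPARATION = 3     # set up
    EXECUTION = 4       # act without asking again


# Hypothetical grants: the highest layer the system may reach per resource.
GRANTS: dict[str, Layer] = {
    "customer_email": Layer.PREPARATION,  # may draft, may not send
    "refund": Layer.RECOMMENDATION,       # may recommend, may not approve
    "prod_config": Layer.PREPARATION,     # may stage, may not deploy
    "internal_wiki": Layer.EXECUTION,     # assumed low consequence
}


def is_allowed(resource: str, requested: Layer) -> bool:
    """Deny by default: a resource with no grant gets no authority at all."""
    ceiling = GRANTS.get(resource)
    return ceiling is not None and requested <= ceiling
```

With this table, `is_allowed("customer_email", Layer.PREPARATION)` is true and `is_allowed("customer_email", Layer.EXECUTION)` is false, which is the "draft, but not send" boundary expressed as a check rather than a hope.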

Why this matters for managers

If you are a manager buying or building AI systems, I think one of the most useful questions you can ask is:

What is the highest-consequence action this system can take without a human checkpoint?

That question cuts through a lot of marketing very quickly.
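
That question can be answered mechanically once the system's actions are inventoried. Here is a minimal sketch, assuming each action carries a consequence score and a flag for whether a human checkpoint gates it. The `Action` type, the 1 to 5 scale, and `highest_unchecked` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    name: str
    consequence: int      # hypothetical scale: 1 (trivial) to 5 (severe)
    needs_approval: bool  # True if a human checkpoint gates this action


def highest_unchecked(actions: list[Action]) -> Optional[Action]:
    """Return the most consequential action that runs with no checkpoint."""
    unchecked = [a for a in actions if not a.needs_approval]
    return max(unchecked, key=lambda a: a.consequence, default=None)
```

The hard part is not the function. It is producing an honest inventory and an honest consequence score for each entry.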

Then ask:

What can it read? Sensitive context plus weak controls is already a governance issue.
What can it stage? Preparation often creates most of the value before full autonomy is safe.
What can it execute? This is the real authority line, not the product demo.
Who approves exceptions? If that answer is fuzzy, the system is not operationally mature.
Can we narrow permissions by task, not just by user? A lot of AI control models are still too coarse.
If it fails, do we know whether the problem was judgment, data, policy, or authorization? If not, your review loop is too weak.

The management mistake is thinking this is mostly a prompt-quality issue. It is usually a delegation-design issue.
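
To make the question about narrowing permissions by task concrete, here is a small sketch. It assumes a tool call must pass two checks: the delegating user holds the tool, and the specific task was granted it. `TaskGrant` and `may_call` are hypothetical names, not any product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskGrant:
    """Hypothetical per-task delegation, narrower than the user's own access."""
    task_id: str
    delegated_by: str
    tools: frozenset[str]


def may_call(grant: TaskGrant, user_tools: frozenset[str], tool: str) -> bool:
    """Allow a call only if the user holds the tool and this task was granted it."""
    return tool in user_tools and tool in grant.tools
```

Under this scheme an agent triaging tickets for a user with refund rights still cannot issue a refund unless that particular task was delegated the refund tool.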

Why this matters for builders

If you are building AI products, there is a practical trap here.

The market rewards products that look highly autonomous. But production trust usually grows in stages.

That means the builder discipline is not:

maximize agent freedom

It is:

design authority gradients

In other words:

let the system see more before it can do more
let it recommend before it can execute
let it prepare before it can commit
let it escalate before it can override
let permissions get narrower as consequences get higher

That is less magical in a launch demo. It is much more useful in a real organization.

If I were designing serious agent systems right now, I would obsess over:

approval boundaries
reversible actions
staged execution
policy-aware tool access
exception routing
auditability of who delegated what to the system

Not because those things are glamorous. Because they are where trust is actually built.
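
Staged execution and auditability can be combined in a very small structure. The sketch below is one possible shape, assuming the agent may stage an action but only a named human can release it. `StagedAction` and its fields are hypothetical, and a real system would persist the audit trail rather than keep it in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StagedAction:
    """Hypothetical two-phase action: the agent stages, a human releases."""
    description: str
    run: Callable[[], str]             # the side effect, deferred until commit
    approved_by: Optional[str] = None
    audit: list[str] = field(default_factory=list)

    def approve(self, approver: str) -> None:
        """Record who authorized the jump from preparation to execution."""
        self.approved_by = approver
        self.audit.append(f"approved by {approver}")

    def commit(self) -> str:
        """Run the action only if someone has approved it; log either way."""
        if self.approved_by is None:
            self.audit.append("commit blocked: no approval")
            raise PermissionError(f"{self.description!r} has not been approved")
        result = self.run()
        self.audit.append(f"committed: {result}")
        return result
```

The design choice worth noticing is that the blocked attempt is logged too. Knowing what the system tried to do without authority is often as useful as knowing what it did.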

A simple test for any AI workflow

Here is the short test I would use.

For each meaningful action, ask:

Can the system see this?
Can it suggest something about this?
Can it prepare something about this?
Can it execute something about this?
Who can authorize the jump from one level to the next?

If those answers are unclear, the workflow is not really designed yet. It is just hopeful.
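
The test above is simple enough to run as a checklist in code. This is a minimal sketch, assuming each action's answers are collected in a dictionary and that a missing or `None` answer counts as unclear. The key names and the `unclear` function are hypothetical.

```python
from typing import Optional

# One key per question in the test; "authorizer" is who approves the next jump.
QUESTIONS = ("see", "suggest", "prepare", "execute", "authorizer")


def unclear(answers: dict[str, Optional[object]]) -> list[str]:
    """List the questions with no explicit answer; an explicit False is an answer."""
    return [q for q in QUESTIONS if answers.get(q) is None]
```

An explicit "no" passes the test. Only silence fails it, which is the point: a workflow where nobody has decided is different from one where somebody decided against.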

Working thesis

My current view is this:

The next important design layer in AI systems is not just intelligence, and not just “human in the loop.” It is authority design.

That means deciding:

what the system may observe
what it may recommend
what it may stage
what it may execute
and where humans actually retain control in a way that matters

People who keep focusing only on capability will miss where a lot of the real product and management work is moving.

Because once AI systems start acting inside real workflows, the decisive question is no longer:

Can it do the job?

It is:

Should it be allowed to do this part of the job yet?

That is a better question. And increasingly, it is the one that will matter most.