Essay / Note

The next agent management problem is context trust

As shared workspace agents spread across ChatGPT, Slack, browsers, and internal tools, the practical risk is no longer only what agents can access. It is also what they should trust when outside content can quietly steer long-running work.

By Mada

A lot of current agent discussion still treats safety as an access-control problem.

People ask:

  • what tools the agent can use
  • what systems it can reach
  • whether it needs approval before acting
  • how narrowly its permissions are scoped

Those questions matter. But I think they are becoming incomplete.

The next agent management problem is not only access control. It is context trust.

That sounds abstract. It is not.

If an agent can browse the web, read Slack, pull from internal docs, inspect tickets, summarize threads, and keep working across multiple steps, then the quality of its decisions depends not only on what it may touch. It also depends on which inputs it should treat as trustworthy, advisory, suspicious, or ignorable.

And I think the market is still underreacting to that shift.

What changed

Two current signals point in the same direction.

First, OpenAI introduced workspace agents in ChatGPT: shared cloud agents that can operate across tools, run on schedules, work inside Slack, and keep going across longer workflows with memory, connected apps, and configurable approvals.

Second, Google’s security team is pushing harder on prompt injection and indirect prompt injection as a real-world threat surface for browsing agents and connected AI systems. The practical warning is simple: the model is no longer only reading trusted instructions from the user. It may also ingest hostile or misleading instructions embedded inside the content it is sent to read.

Put those together and the management question changes.

Once agents become shared workers rather than one-shot assistants, the problem is no longer just:

What can this agent do?

It also becomes:

What should this agent believe?

That is a different operating problem. And it is growing faster than a lot of approval-centric governance language admits.

Why this matters

Permissions answer one question:

  • what the agent is allowed to touch

Context trust answers another:

  • what the agent is willing to treat as guidance while touching it

Those are not the same.

An agent can have beautifully scoped access and still be steered badly.

For example:

  • a research agent can browse the wrong source and import a poisoned frame
  • a support-routing agent can over-trust a noisy forum thread and escalate the wrong issue
  • a vendor-risk agent can absorb manipulative content from a target company website
  • a workflow agent can read a Slack thread full of stale assumptions and carry them forward as if they were current instructions

In each case, the system may not be over-permissioned. It may simply be over-trusting.

That matters because current product design is clearly moving toward:

  • longer-running work
  • shared team agents
  • more connected tools
  • more browsing and retrieval
  • more autonomous preparation before human review

The more context an agent consumes, the more important context hygiene becomes.

What people are overreacting to

I think people are still overreacting to the visible permission layer.

That shows up as a lot of emphasis on things like:

  • require approval before sending an email
  • require approval before editing a spreadsheet
  • limit which apps can connect
  • narrow the systems the agent can write to

Those are good controls. But they are not sufficient if the agent is already making bad upstream judgments because it trusted the wrong evidence, the wrong embedded instruction, or the wrong contextual cue.

A late approval on a polished bad draft is not much comfort. It usually means the expensive mistake already happened earlier.

The system already:

  • chose the wrong frame
  • prioritized the wrong evidence
  • anchored on the wrong objective
  • imported hostile or noisy guidance into the workflow

At that point, approval is often cleanup. Not prevention.

What people are underreacting to

I think people are underreacting to the fact that shared agents widen the attack and confusion surface through context, not only through actions.

When an agent is operating across:

  • Slack
  • documents
  • tickets
  • web pages
  • internal wikis
  • connected apps
  • scheduled background runs

it is no longer only executing instructions. It is continuously deciding which instructions count.

That means the next practical design questions look more like:

  • which sources are authoritative?
  • which sources are merely informative?
  • which sources should never be allowed to change the task?
  • when should external content be treated as data instead of instruction?
  • when should the agent escalate because the context looks conflicted, stale, or adversarial?

This is why I think “keep a human in the loop” is still too vague.

A human at the end of the process may review the output. But the more important control may be earlier:

  • before the agent adopts a plan
  • before it lets retrieved material reshape the objective
  • before it lets a connected system override user intent
  • before it treats ambient content as instruction rather than evidence

That is context governance. And it is becoming a first-order management issue.

Who should care

1. Managers deploying shared agents

If your team is building shared agents for sales, support, finance, research, or internal ops, ask a better question than:

Can the agent access the right tools?

Also ask:

What sources are allowed to influence the agent’s decisions, and under what trust level?

If the answer is fuzzy, you probably have a context-governance problem even if your permissions look tidy.

2. Builders designing agent products

If your product involves browsing, retrieval, connected apps, Slack surfaces, or long-running workflows, you need more than app permissions and approval buttons.

You need a view on:

  • source trust ranking
  • instruction-vs-data separation
  • suspicious-context handling
  • escalation when retrieved context conflicts with user intent
  • auditability around why the agent followed one thread of guidance instead of another

The better product will not only say:

  • here is what the agent can do

It will also help users see:

  • here is what the agent trusted
  • here is what it treated as evidence only
  • here is where it refused to let outside content rewrite the task

3. Knowledge workers using agents personally

Even at the individual level, this matters.

If you use agents for research, planning, writing, or automation, the practical mistake is not only giving too much authority. It is assuming all retrieved context is neutral.

A polished answer built on contaminated context can feel persuasive while being strategically wrong.

What to do differently

Here is the operating stance I would use now.

1. Separate trusted instruction from ambient context

The agent should know the difference between:

  • user instruction
  • approved workflow rules
  • high-trust internal reference material
  • low-trust external content
  • untrusted content that should be read as data only

Do not let all context enter the system at the same status level.

2. Give sources trust tiers

Not every source should be equally allowed to shape behavior.

For example:

  • policy docs may define the workflow
  • tickets and threads may provide situational context
  • public web pages may provide evidence but not authority
  • unknown external content may require skepticism or explicit confirmation before it can alter a task

That sounds obvious. Many systems still do not behave this way.

3. Add checkpoints before context becomes action

The highest-value checkpoint is often not before the final external action. It is before the agent turns contested or ambiguous context into a plan.

A strong intervention point looks like:

  • here is the task I think I am solving
  • here are the sources I trusted most
  • here are the conflicts or suspicious instructions I saw
  • here is the proposed next step

That lets the human correct the frame before cleanup compounds.

4. Measure bad-trust incidents, not only bad actions

A lot of teams will track:

  • wrong sends
  • wrong edits
  • wrong API calls

They should also track:

  • wrong assumptions imported from retrieved context
  • stale or conflicting guidance followed by the agent
  • cases where external content tried to redirect the workflow
  • situations where the model should have escalated but did not

Those are earlier warnings. And they often matter more.

5. Design for explicit suspicion

Some agent systems still treat skepticism like a model trait. I think it increasingly has to be a workflow feature.

In other words:

  • some contexts should trigger stricter interpretation
  • some sources should never be allowed to redefine the task
  • some conflicts should force clarification rather than confident continuation

That is not paranoia. It is operational maturity.

The practical mistake to avoid

The mistake is not building shared agents. The mistake is assuming that permission design alone makes shared agents safe enough to trust.

That was already incomplete. It becomes even more incomplete once agents are:

  • shared across teams
  • embedded in Slack
  • connected to more apps
  • scheduled in the background
  • expected to prepare real work before a human checks it

At that point, the system is not only acting in your environment. It is also reasoning inside a messy mix of signals.

So the real question is no longer only:

What may this agent do?

It is also:

What should this agent distrust before it does it?

Working thesis

My current view is this:

The next management layer for agents is not only authority design. It is context trust design.

That means deciding:

  • which sources carry instruction authority
  • which sources are evidence only
  • where suspicion should increase
  • when conflicts should trigger escalation
  • how the agent shows its trust choices before work moves forward

OpenAI’s shared workspace agents and Google’s prompt-injection warning point to the same deeper shift.

Agents are moving into real work. And real work does not only require permissions. It requires judgment about what deserves belief.

That is a less glamorous story than another benchmark or launch demo. It is also the one more teams are about to run into.