
The next software management job is deciding what gets automated

As coding agents move from assistance toward automation, the practical management question is no longer just whether developers use AI. It is which classes of work should be automated, supervised, staged, or kept human-led.

By Mada • Apr 26, 2026

A useful data point slipped past a lot of the usual model chatter this week.

Anthropic’s latest Economic Index on software development found that 79% of Claude Code conversations looked like automation rather than augmentation. In plain English: a lot of people are no longer just asking AI to help them think. They are asking it to do chunks of the work.

That matters more than another benchmark screenshot.

Once coding agents are used mainly for automation, the management question changes.

It is no longer just:

Should our developers use AI?

It becomes:

Which kinds of software work should be automated, which should be supervised closely, and which should stay firmly human-led?

I think a lot of teams are still underreacting to that shift.

What changed

Two things now seem clearer.

First, coding agents are moving from assistive surfaces toward delegated execution.

Anthropic’s data is one signal. The broader product market is another. The tools being packaged around agents now assume longer-running tasks, tool use, repository context, environment access, and multi-step execution rather than just code suggestions in a text box.

Second, the adoption pattern still looks uneven.

The same report suggests startups are much earlier and more aggressive adopters of Claude Code than enterprises. That tracks with what you would expect. Smaller teams can accept more workflow mess, looser controls, and faster experimentation. Larger organizations usually cannot.

That gap is important.

It does not mean enterprises are behind because they are stupid. It often means the real problem is no longer raw model access. It is operational design:

what the agent is allowed to touch
which tasks it can complete end to end
where review belongs
what counts as safe enough to automate
who owns the failure when the automation is wrong

That is a management design problem, not just a tooling problem.

What people are overreacting to

I think people are still overreacting to the simple headline:

AI can now handle more of the software lifecycle.

Yes, it can. But that statement is too crude to be useful.

The mistake is jumping from “the agent is impressive” to “we should automate as much development work as possible.”

Software work is not one thing.

Some tasks are:

bounded
reversible
easy to test
easy to diff
low-cost to redo

Those are strong candidates for automation.

Other tasks are:

architecture-shaping
dependency-sensitive
politically cross-functional
security-relevant
hard to evaluate until much later

Those are not the same kind of work.

If a team treats both categories as equally automatable, it will confuse model capability with managerial judgment.

What people are underreacting to

The underreaction is that teams now need task classes for automation.

Not just AI policies. Not just approved tools. Not just “engineers may use copilots.”

Task classes.

In other words, a team should be able to say something like:

Class 1 — Safe to automate

Good candidates:

test generation for well-understood modules
routine refactors with strong diff visibility
documentation drafts
boilerplate UI work
simple migration scripts with clear rollback

Class 2 — Automate, but with staged review

Good candidates:

multi-file feature implementation
bug fixing in familiar code paths
data transformation jobs
support tooling
infra changes in non-production environments

Class 3 — Human-led, AI-assisted

Good candidates:

architecture decisions
permission model changes
sensitive customer workflows
production incident response
changes with large blast radius or unclear evaluation

That structure matters because it shifts the conversation away from vague tool enthusiasm and toward governed delegation.

And that is where the real leverage is.
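
One way to make the three classes concrete is to write them down as a small lookup table that a team can review and argue about. The sketch below is illustrative only: the task-type labels and the `classify` helper are hypothetical names, and sending unlisted work to the human-led class by default is an assumption, not something the classes themselves require.

```python
from enum import Enum


class TaskClass(Enum):
    AUTOMATE = 1        # Class 1: safe to automate
    STAGED_REVIEW = 2   # Class 2: automate, but with staged review
    HUMAN_LED = 3       # Class 3: human-led, AI-assisted


# Hypothetical task-type labels; the groupings mirror the three lists above.
TASK_CLASSES = {
    "test_generation": TaskClass.AUTOMATE,
    "routine_refactor": TaskClass.AUTOMATE,
    "documentation_draft": TaskClass.AUTOMATE,
    "boilerplate_ui": TaskClass.AUTOMATE,
    "simple_migration_script": TaskClass.AUTOMATE,
    "multi_file_feature": TaskClass.STAGED_REVIEW,
    "familiar_bug_fix": TaskClass.STAGED_REVIEW,
    "data_transformation_job": TaskClass.STAGED_REVIEW,
    "support_tooling": TaskClass.STAGED_REVIEW,
    "non_production_infra_change": TaskClass.STAGED_REVIEW,
    "architecture_decision": TaskClass.HUMAN_LED,
    "permission_model_change": TaskClass.HUMAN_LED,
    "sensitive_customer_workflow": TaskClass.HUMAN_LED,
    "production_incident_response": TaskClass.HUMAN_LED,
}


def classify(task_type: str) -> TaskClass:
    """Look up a task type; unlisted work falls back to the most
    conservative class (an assumption of this sketch)."""
    return TASK_CLASSES.get(task_type, TaskClass.HUMAN_LED)


if __name__ == "__main__":
    for task in ("documentation_draft", "multi_file_feature", "billing_rewrite"):
        print(task, "->", classify(task).name)
```

The table itself matters less than the fact that it exists in a reviewable form: a disagreement about where a task belongs becomes a one-line change that someone has to approve.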

Why this matters for managers

If you manage software teams, I think one of the least useful questions you can ask now is:

Are we using coding agents enough?

That question invites status theater.

People can answer it with screenshots, anecdotes, and noisy productivity claims. But it does not tell you whether the team is automating the right things.

The more useful questions are:

What work are we comfortable automating end to end?
What work requires plan approval before execution?
What work should stay human-led because evaluation happens too late or failure cost is too high?
Where do we expect the agent to prepare work, not finish it?
Which mistakes are cheap, and which ones create hidden cleanup?

Those questions are much more operationally honest.

They also reveal something important.

The next management job is not just tool adoption. It is automation boundary design.

That includes:

task classification
review placement
repo and environment permissions
rollback expectations
escalation rules
evaluation standards by work type

That is how teams escape the false choice between:

fully manual software work
blind faith in coding agents

Why this matters for builders

If you are building coding-agent products, there is a real lesson here too.

A lot of the market still sells the dream of a general software worker. But real trust usually grows through narrower, better-defined operating envelopes.

The strongest products will not only say:

here is a powerful coding agent

They will also help teams answer:

what sort of work is this agent fit for?
what approval shape should this task use?
what context should be visible before execution?
what is reversible?
what is too risky to batch blindly?

In other words, the better product will often be the one that helps a team classify work, not just generate more of it.

A practical operating model

If I were managing a software team using coding agents right now, I would not start by saying:

Everyone should use the agent more.

I would start with a lightweight automation map.

For each recurring task type, define:

1. Task shape

Is it predictable, repetitive, and easy to inspect?

2. Blast radius

If the agent gets this wrong, what actually breaks?

3. Evaluation speed

Can we tell quickly whether the work is good, or will the mistake surface much later?

4. Required context

Does success require local code understanding only, or broader system and business judgment?

5. Approval mode

Should the agent:

execute directly
prepare and wait
propose options only
stay out of this class entirely

That gives you a much better operating model than generalized hype about AI coding productivity.
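
The five questions above can be captured as one record per recurring task type, plus a rule that proposes an approval mode. This is a minimal sketch under stated assumptions: the `TaskProfile` fields, the three blast-radius levels, and the thresholds inside `suggest_mode` are all hypothetical choices made for illustration, and a real team would tune them to its own risk tolerance.

```python
from dataclasses import dataclass
from enum import Enum


class ApprovalMode(Enum):
    EXECUTE_DIRECTLY = "execute directly"
    PREPARE_AND_WAIT = "prepare and wait"
    PROPOSE_OPTIONS = "propose options only"
    STAY_OUT = "stay out of this class entirely"


@dataclass(frozen=True)
class TaskProfile:
    name: str
    predictable: bool         # 1. task shape: repetitive and easy to inspect?
    blast_radius: str         # 2. "low", "medium", or "high" (assumed scale)
    fast_evaluation: bool     # 3. can we tell quickly whether the work is good?
    local_context_only: bool  # 4. local code understanding is enough?


def suggest_mode(profile: TaskProfile) -> ApprovalMode:
    """Propose an approval mode (question 5) from the other four answers.

    The ordering of the checks is an assumption of this sketch: the most
    restrictive conditions are tested first.
    """
    if profile.blast_radius not in ("low", "medium", "high"):
        raise ValueError(f"unknown blast radius: {profile.blast_radius!r}")
    # Large blast radius with slow feedback: keep the agent out.
    if profile.blast_radius == "high" and not profile.fast_evaluation:
        return ApprovalMode.STAY_OUT
    # High stakes or broad business judgment: the agent only proposes.
    if profile.blast_radius == "high" or not profile.local_context_only:
        return ApprovalMode.PROPOSE_OPTIONS
    # Predictable, low-stakes, quickly checkable work can run unattended.
    if (profile.predictable and profile.fast_evaluation
            and profile.blast_radius == "low"):
        return ApprovalMode.EXECUTE_DIRECTLY
    # Everything else: the agent prepares the work and waits for approval.
    return ApprovalMode.PREPARE_AND_WAIT


if __name__ == "__main__":
    profiles = [
        TaskProfile("test generation", True, "low", True, True),
        TaskProfile("multi-file feature", False, "medium", True, True),
        TaskProfile("permission model change", False, "high", False, False),
    ]
    for p in profiles:
        print(f"{p.name}: {suggest_mode(p).value}")
```

The rule is deliberately crude. Its job is to force an explicit answer for each task type, so that exceptions are discussed rather than discovered.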

The practical mistake to avoid

Adopting coding agents is not the mistake. The mistake is adopting them without a view of which work deserves which level of automation.

That is how teams end up with one of two bad outcomes.

Failure mode 1: underuse

The organization gets spooked and restricts AI to trivial autocomplete-style assistance, leaving a lot of useful automation on the table.

Failure mode 2: overdelegation

The organization lets agents roam across large, messy, weakly evaluated tasks, then discovers too late that the cleanup burden erased the speed gains.

The right answer is usually in the middle.

Not generic caution. Not blind acceleration.

Structured delegation.

Working thesis

My current view is this:

The next software management job is not deciding whether AI can code. It is deciding what kinds of software work should be automated, staged, supervised, or kept human-led.

Anthropic’s data matters because it suggests the transition is already underway.

Coding agents are no longer just another way to brainstorm. They are becoming a delegation layer.

And once that happens, the practical winners will not be the teams with the loudest AI enthusiasm. They will be the teams that get more precise about automation boundaries, review design, and task classes.

That is a much less glamorous story than “AI writes code now.”

It is also the more useful one.