6. Subagents and delegation
A subagent is a fresh agent that the main agent launches to handle a sub-task in isolation. There are three reasons to use one: parallelization (running independent sub-tasks such as searches, analyses, or builds at the same time), context protection (keeping large, noisy outputs out of the main thread), and autonomous orchestration (an agent that coordinates other agents to drive a larger task end-to-end).
What you'll learn
- The three real reasons to delegate to a subagent.
- When to do the work inline in the main thread instead.
- How to brief a subagent so it actually returns something useful.
- How multi-agent orchestration scales a single human's attention across long-running tasks.
- The single failure mode that ruins delegation: outsourcing the thinking that was yours to do.
The three real reasons
1. Context protection
The main agent's conversation is precious. Every token spent on a 4,000-line grep output is a token that's not spent on the actual problem. A subagent can run that grep, read the matches, and return one paragraph of synthesis. The noise stays in the subagent's context and dies with it.
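As a rough illustration of that idea, here is a minimal Python sketch. It assumes a hypothetical `summarize_matches` function standing in for a subagent (it is not a real agent API): the raw matches exist only in the function's local scope, and only a one-sentence synthesis crosses back to the caller.

```python
import re
from collections import Counter


def summarize_matches(files: dict[str, str], pattern: str, top_n: int = 3) -> str:
    """Hypothetical stand-in for a subagent: scan files, return a short synthesis.

    The raw matches (the "noise") are only counted in local variables and are
    discarded when the function returns; the caller never sees them.
    """
    regex = re.compile(pattern)
    hits_per_file: Counter[str] = Counter()
    for path, text in files.items():
        for line in text.splitlines():
            if regex.search(line):
                hits_per_file[path] += 1

    total = sum(hits_per_file.values())
    if total == 0:
        return f"No matches for {pattern!r} in {len(files)} files."
    top = ", ".join(f"{path} ({n})" for path, n in hits_per_file.most_common(top_n))
    return (
        f"{total} matches for {pattern!r} across {len(hits_per_file)} "
        f"of {len(files)} files; most hits: {top}."
    )


# Illustrative in-memory "repository"; a real subagent would read from disk.
repo = {
    "auth/login.py": "def login():\n    check_token()\n    check_token()\n",
    "auth/session.py": "def refresh():\n    check_token()\n",
    "billing/invoice.py": "def total():\n    return 0\n",
}
summary = summarize_matches(repo, r"check_token")
print(summary)
```

The main thread pays for one sentence, however many lines the search touched.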
2. Parallelization
Three independent questions ("where is auth handled?", "what does the user model look like?", "how are emails sent?") can be answered by three subagents at the same time. Answering them one after another in the main thread takes longer and gains nothing.
3. Autonomous orchestration
An agent can be built specifically to coordinate other agents — see "Multi-agent orchestration" below. This is qualitatively different from the first two: it's not about protecting one human-driven conversation, it's about scaling a single human's attention across work that would otherwise need constant supervision.
Two reasons protect the main context, one extends it
Parallelization and context protection are about a human-driven session staying sharp. Orchestration is about removing the human from the inner loop entirely for well-defined, verifiable tasks. If none of the three applies, you don't need a subagent.
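The "if none of the three applies" rule can be written down as a small check. This is only a sketch: the `Task` fields below are hypothetical names chosen for illustration, and in practice the judgment is made by a person, not computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a piece of work; field names are illustrative."""
    noisy_output: bool = False           # would flood the main context (big grep, long log)
    independent_siblings: int = 0        # other sub-tasks that could run at the same time
    verifiable_end_to_end: bool = False  # well-defined goal with an automated acceptance check


def reasons_to_delegate(task: Task) -> list[str]:
    """Return which of the three reasons apply; an empty list means do it inline."""
    reasons = []
    if task.noisy_output:
        reasons.append("context protection")
    if task.independent_siblings >= 1:
        reasons.append("parallelization")
    if task.verifiable_end_to_end:
        reasons.append("autonomous orchestration")
    return reasons


rename_variable = Task()
search_200_files = Task(noisy_output=True)
print(reasons_to_delegate(rename_variable))   # []
print(reasons_to_delegate(search_200_files))  # ['context protection']
```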
The anti-reason
Don't delegate because it sounds fancy
Delegating "rename this variable" to a subagent is slower, more expensive, and harder to debug than just doing it. The cost of spinning up a subagent — briefing, waiting, parsing the response — is real. Pay it only when the savings are real.
Inline vs delegate: a cheat sheet
| Situation | Do it inline | Delegate to subagent |
|---|---|---|
| Edit a file you're already looking at | Yes | No |
| Search 200 files for a pattern | No | Yes — return only matches |
| Read three unrelated files in parallel | No | Yes — one subagent each |
| Decide which approach to take | Yes | No — never delegate the decision |
| Summarize a 2,000-line log file | No | Yes — return the summary |
| Write the final code change | Yes | No — you own the synthesis |
| Coordinate a multi-step task end-to-end with verification | No | Yes — orchestrator pattern (see below) |
The pattern for human-driven sessions: delegate the gathering, keep the deciding. The pattern for orchestration: delegate the loop, keep the goal and the acceptance check.
Briefing a subagent
A subagent doesn't share your conversation. It walks in cold. Brief it the way you'd brief a colleague who just sat down at your desk: goal, context, constraints, expected output.
A concrete example prompt
Suppose the main task is "add rate limiting to the public API." Before writing code, you want to know how the existing middleware stack is organized. Here's a good subagent brief:
Goal: Map the HTTP middleware stack of this Spring Boot service so I can
decide where to insert a rate-limiter.
Context:
- Repo root: /Users/me/work/payments-api
- Framework: Spring Boot 3.3, Java 21
- I am about to add a rate limiter for the /v1/public/** routes
- I have NOT yet decided between Bucket4j and Resilience4j
Please investigate:
1. List every class annotated with @Configuration that registers a
filter, interceptor, or WebMvcConfigurer customization.
2. For each, note the order (Ordered / @Order) and which URL patterns
it applies to.
3. Identify the current authentication filter and where in the chain
it sits.
Constraints:
- Read-only. Do not modify any files.
- Do not recommend a library — I'll decide that myself.
- Budget: ~5 minutes of exploration.
Expected output:
- A table: filter/interceptor name -> order -> URL pattern -> file path
- A 3-sentence summary of where a new filter would naturally fit
- Nothing else
Notice the structure: a single goal, the context the subagent needs (and only that), explicit constraints (read-only, no library recommendation), and a precise output format. With a brief like this, the subagent should return roughly 30 useful lines instead of 3,000 noisy ones.
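If you write briefs often, the four-part structure can be captured in a small template. The sketch below is one possible shape, not a prescribed format: the `Brief` class and its `render` method are hypothetical, and the refusal to render without an expected output is a design choice made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical brief template: goal, context, constraints, expected output."""
    goal: str
    context: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    expected_output: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Build the prompt text, refusing briefs that lack a goal or an output spec."""
        if not self.goal.strip():
            raise ValueError("a brief needs a single, explicit goal")
        if not self.expected_output:
            raise ValueError("specify the expected output before delegating")
        sections = [f"Goal: {self.goal.strip()}"]
        for title, items in (
            ("Context", self.context),
            ("Constraints", self.constraints),
            ("Expected output", self.expected_output),
        ):
            if items:
                bullets = "\n".join(f"- {item}" for item in items)
                sections.append(f"{title}:\n{bullets}")
        return "\n".join(sections)


brief = Brief(
    goal="Map the HTTP middleware stack of this Spring Boot service.",
    context=["Framework: Spring Boot 3.3, Java 21"],
    constraints=["Read-only. Do not modify any files."],
    expected_output=["A table: filter/interceptor name -> order -> URL pattern -> file path"],
)
print(brief.render())
```

Failing loudly on a missing output spec turns a vague brief into an error at writing time rather than a vague answer later.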
If you can't write the brief, you're not ready to delegate
A vague brief produces a vague answer. If you struggle to specify the expected output, the task probably isn't well-defined enough to delegate yet.
Parallel delegation
When sub-tasks are independent, fire them in parallel in a single turn. A typical "explore an unfamiliar codebase" turn might launch three subagents at once:
- One mapping the HTTP layer.
- One mapping the persistence layer.
- One mapping the background jobs.
Each returns a summary; the main agent stitches them together. The whole exploration finishes in the time of the slowest subagent, not the sum.
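The timing claim is easy to see in a toy model. In the sketch below, `run_subagent` is a hypothetical placeholder and `asyncio.sleep` stands in for real exploration work; the durations are made up. The three calls are launched together with `asyncio.gather`, so the total wall-clock time tracks the slowest one.

```python
import asyncio
import time


async def run_subagent(name: str, seconds: float) -> str:
    """Hypothetical subagent call; asyncio.sleep stands in for real exploration."""
    await asyncio.sleep(seconds)
    return f"{name}: summary ready"


async def explore() -> tuple[list[str], float]:
    """Launch three independent subagents at once and time the whole turn."""
    start = time.perf_counter()
    summaries = await asyncio.gather(
        run_subagent("http-layer", 0.3),
        run_subagent("persistence-layer", 0.2),
        run_subagent("background-jobs", 0.1),
    )
    return list(summaries), time.perf_counter() - start


summaries, elapsed = asyncio.run(explore())
print(summaries)
print(f"elapsed ~{elapsed:.2f}s (slowest subagent: 0.30s, sum: 0.60s)")
```

`asyncio.gather` returns results in the order the calls were passed in, so the main agent can stitch the summaries together without tracking which finished first.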
Specialized subagents
Some teams define named subagents for common roles: a research subagent (read-only, web + code search), a planner (produces a plan, writes no code), a code-search subagent (greps, reads, returns paths and snippets). Specialization is useful when the role is reused often enough to justify a stable brief — otherwise, an ad-hoc subagent with a good prompt is fine.
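One way to picture a named role is as a stable brief plus a tool allow-list. The registry below is a sketch under that assumption; the `SubagentRole` class, the tool names, and the brief texts are all hypothetical and do not correspond to any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentRole:
    """Hypothetical role definition: a reusable brief plus the tools it may use."""
    name: str
    allowed_tools: frozenset[str]
    may_write_code: bool
    standing_brief: str


# Tool names here are illustrative placeholders, not a real tool catalogue.
ROLES = {
    role.name: role
    for role in (
        SubagentRole(
            "research",
            frozenset({"web_search", "code_search", "read_file"}),
            False,
            "Read-only. Gather facts and say where each one came from.",
        ),
        SubagentRole(
            "planner",
            frozenset({"read_file"}),
            False,
            "Produce a step-by-step plan. Write no code.",
        ),
        SubagentRole(
            "code-search",
            frozenset({"grep", "read_file"}),
            False,
            "Return file paths and short snippets only.",
        ),
    )
}


def check_tool(role_name: str, tool: str) -> bool:
    """Return True if the named role is allowed to use the given tool."""
    return tool in ROLES[role_name].allowed_tools


print(check_tool("research", "web_search"))  # True
print(check_tool("planner", "grep"))         # False
```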
Multi-agent orchestration: agents that call agents
Everything above describes a human driving a single agent that occasionally delegates. But the same primitive composes one level up: an agent can itself be built as an orchestrator whose main job is to call other agents, read their outputs, decide what to do next, and run the loop autonomously until a larger task is done. The human briefs the orchestrator once; the orchestrator coordinates the rest.
A realistic shape:
- An orchestrator agent receives "ship a new endpoint for X".
- It calls a research subagent to map the existing code.
- It calls a planner subagent to produce a step-by-step plan.
- It calls a coder subagent to implement each step.
- It calls a reviewer subagent (or runs tests/lint as tools) to verify.
- If the reviewer fails, it loops back to the coder with the feedback.
- Only when the verification loop closes does it hand back to the human.
Each subagent stays narrow and stateless; the orchestrator carries the high-level intent and the synthesis. This is how you scale a single human's attention across long-running, multi-step work — and it's the natural bridge between "I drive the agent" and "the agent runs an SDLC phase end-to-end".
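The loop described above can be sketched in a few lines. Everything here is an assumption made for illustration: the subagents are passed in as plain callables, `Review` is a hypothetical result type, and the per-step attempt limit is a choice of this example rather than something the pattern dictates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    """Hypothetical reviewer verdict."""
    passed: bool
    feedback: str = ""


def orchestrate(
    task: str,
    research: Callable[[str], str],
    plan: Callable[[str, str], list[str]],
    code: Callable[[str, str], str],
    review: Callable[[str], Review],
    max_attempts_per_step: int = 3,
) -> list[str]:
    """Research, plan, then code each step until the reviewer accepts it."""
    findings = research(task)
    accepted = []
    for step in plan(task, findings):
        feedback = ""
        for _ in range(max_attempts_per_step):
            change = code(step, feedback)
            verdict = review(change)
            if verdict.passed:
                accepted.append(change)
                break
            feedback = verdict.feedback  # loop back to the coder with the feedback
        else:
            # The verification loop never closed: stop and return control.
            raise RuntimeError(f"step {step!r} failed verification; handing back to the human")
    return accepted


# Stub subagents: the first draft always fails review, the revision passes.
def fake_code(step: str, feedback: str) -> str:
    return f"{step} (fixed)" if feedback else f"{step} (draft)"


def fake_review(change: str) -> Review:
    return Review(passed=change.endswith("(fixed)"), feedback="tests fail")


done = orchestrate(
    "ship a new endpoint for X",
    research=lambda task: "existing routes mapped",
    plan=lambda task, findings: ["add route", "add tests"],
    code=fake_code,
    review=fake_review,
)
print(done)  # ['add route (fixed)', 'add tests (fixed)']
```

Note that the coder and reviewer receive only the step and the feedback, while the orchestrator alone holds the task, the findings, and the list of accepted changes.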
Orchestration multiplies both leverage and blast radius
A subagent that ships broken code wastes minutes. An autonomous orchestrator that keeps shipping broken code for an hour wastes the whole hour, plus the time needed to unwind what it built on top. Multi-agent setups need stronger verification loops, clearer stop conditions, and tighter sandboxing, not weaker ones. Don't reach for orchestration until your single-agent verification story is solid (see chapter 1).
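A "clear stop condition" can be as simple as an explicit budget checked on every pass through the loop. The `StopConditions` class below is a hypothetical sketch with two assumed limits, an iteration count and a wall-clock budget; real setups may add others, such as a spend cap.

```python
import time
from typing import Optional


class StopConditions:
    """Hypothetical guard: stop an autonomous loop on iteration or time budget."""

    def __init__(self, max_iterations: int, max_seconds: float) -> None:
        self.max_iterations = max_iterations
        self.max_seconds = max_seconds
        self.iterations = 0
        self.started = time.monotonic()

    def tick(self) -> Optional[str]:
        """Record one loop iteration; return a reason to stop, or None to continue."""
        self.iterations += 1
        if self.iterations > self.max_iterations:
            return "iteration budget exhausted"
        if time.monotonic() - self.started > self.max_seconds:
            return "time budget exhausted"
        return None


guard = StopConditions(max_iterations=3, max_seconds=60.0)
log = []
while (reason := guard.tick()) is None:
    log.append(f"attempt {guard.iterations}")  # a real loop would call subagents here
print(log)
print(reason)
```

The loop ends with a stated reason, which is what the orchestrator should report when it hands control back.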
The fatal failure mode
Never delegate the synthesis that was yours to do
In a human-driven session, the moment you ask a subagent "...and then decide which approach we should take and start implementing it," you've handed away the most important part of the job. The subagent doesn't have your full context, doesn't know the trade-offs you've already weighed, and doesn't carry the responsibility for the result.
Orchestration looks like an exception but isn't: an orchestrator does synthesize across its subagents — that's its job. What stays with the human is one level up: deciding what problem to solve, what "done" means, and how success will be verified. Delegate that, and no amount of orchestration will save you.
Key takeaway
Subagents are for protecting context, parallelizing, and orchestrating autonomous work — not for offloading the thinking that defines the task. Delegate the gathering and the loop; keep the goal, the trade-offs, and the acceptance check.