Execution & Subagents

Each task runs in a fresh context. No contamination. No assumptions. No inherited confusion.


Why Fresh Subagents?

When an AI agent works on a long task, it accumulates context — previous decisions, earlier errors, half-formed assumptions from steps it took an hour ago. By the time it reaches task 15 in a plan, it may be subtly (or not so subtly) influenced by everything that happened in tasks 1 through 14.

This is called context contamination, and it's one of the primary causes of cascading failures in AI-assisted development. A misunderstanding in task 3 silently propagates. An incorrect assumption made in task 7 shapes the code written in tasks 8, 9, and 10. By the time anyone notices something is wrong, untangling it is expensive.

Superpowers solves this with subagents: fresh AI instances with no memory of previous tasks. Each task is dispatched to a clean context. Whatever happened before doesn't exist. The subagent reads the task, reads the relevant code, and executes exactly what the task says — nothing more, nothing less.

The Three Benefits of Context Isolation

  1. No error propagation — A mistake in task 3 doesn't affect task 4. Each task either succeeds or fails on its own.

  2. Unbiased review — The agent reviewing a completed task has no attachment to the code it's reviewing. It didn't write it. It has no reason to rationalize problems away.

  3. Parallel execution safety — Tasks that don't depend on each other can be dispatched simultaneously. Fresh contexts don't conflict.


The Controller Process

The controller is the main orchestrating agent — it manages the plan execution loop. Here's how it works:

┌──────────────────────────────────────────────────────────────┐
│                     CONTROLLER LOOP                          │
│                                                              │
│  1. Load plan → identify next PENDING task                   │
│  2. Dispatch subagent with: task description + relevant code │
│  3. Subagent completes task → returns result + status        │
│  4. Controller runs TWO-STAGE REVIEW                         │
│  5. Handle status:                                           │
│     DONE           → mark task complete, continue            │
│     DONE_WITH_CONCERNS → log, continue with note             │
│     NEEDS_CONTEXT  → gather context, re-dispatch             │
│     BLOCKED        → stop, escalate to user                  │
└──────────────────────────────────────────────────────────────┘
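The loop above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation; the task shape (dicts with `id`/`status` keys) and the callback signatures are assumptions made for the example.

```python
def run_controller_loop(plan, dispatch, review, gather_context):
    """Walk the plan in order, dispatching each PENDING task to a fresh subagent.

    dispatch(task)       -> (status, result) from a fresh subagent
    review(task, result) -> final status after the two-stage review
    gather_context(task) -> the task enriched with the missing context
    """
    concerns = []
    for task in plan:                       # plan is an ordered list of task dicts
        if task["status"] != "PENDING":
            continue
        status, result = dispatch(task)
        if status == "NEEDS_CONTEXT":
            task = gather_context(task)     # gather context, then re-dispatch
            status, result = dispatch(task)
        status = review(task, result)       # two-stage review may downgrade the status
        if status == "BLOCKED":
            return ("ESCALATE", task["id"], concerns)   # stop; hand off to the user
        if status == "DONE_WITH_CONCERNS":
            concerns.append((task["id"], result))       # log, keep going
        task["status"] = "DONE"
    return ("COMPLETE", None, concerns)
```

Note that the review runs on every path that continues: a re-dispatched task is reviewed the same way a first-attempt task is.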

Loading the Plan

The controller reads the full plan document and identifies the first task with Status: PENDING. It doesn't skip ahead, doesn't reorder, and doesn't decide to "combine" tasks for efficiency. The plan is the contract — the controller executes the contract.

Dispatching a Subagent

The controller creates a subagent with exactly the information the subagent needs:

  • The task description
  • The exact code to write (from the plan)
  • The exact files to modify
  • The exact commands to run

Nothing extra. No history of previous tasks (unless the task explicitly requires it as context). The subagent's job is to execute one task cleanly.
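As a sketch, the dispatch payload can be modeled as an immutable record. The field names here are illustrative; the skill does not prescribe a schema. What matters is what the record deliberately leaves out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskDispatch:
    """Exactly what a subagent receives -- and nothing else."""
    description: str          # the task description from the plan
    code_to_write: str        # exact code, copied from the plan
    files_to_modify: tuple    # exact file paths
    commands_to_run: tuple    # exact verification commands
    # Deliberately absent: history of previous tasks, the rest of the plan,
    # and any accumulated conversation context.
```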


Two-Stage Review

After a subagent completes a task, the controller does not immediately accept the result. It runs a two-stage review:

Stage 1: Spec Compliance

"Did the subagent build what the task specified?"

The controller compares the subagent's output against the task description:

  • Was the correct file modified?
  • Was the correct function created with the correct signature?
  • Were the required tests written?
  • Were any out-of-scope changes made?

If the output doesn't match the spec, the task is not complete. The controller notes the deviation and either re-dispatches with clarification or escalates to the user.
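Stage 1 amounts to a mechanical checklist comparison. A minimal sketch, assuming a simple dict shape for both the task spec and the subagent's reported output (all keys are hypothetical):

```python
def check_spec_compliance(task, output):
    """Return the list of deviations between the task spec and the subagent output.

    An empty list means the output matches the spec.
    """
    deviations = []
    # Was the correct file modified -- and nothing out of scope?
    if set(output["files_modified"]) != set(task["files_to_modify"]):
        deviations.append("wrong or out-of-scope files modified")
    # Was the required function created?
    if task["required_function"] not in output["functions_created"]:
        deviations.append("required function missing")
    # Were the required tests written?
    if task["tests_required"] and not output["tests_written"]:
        deviations.append("required tests not written")
    return deviations
```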

Stage 2: Code Quality

"Is the code correct, readable, and maintainable?"

This is evaluated independently of spec compliance. A subagent can perfectly match the spec and still produce poor code:

  • Missing error handling for obvious cases
  • Incorrect logic that happens to pass the specified test
  • Naming conventions that don't match the codebase
  • Security issues

Code quality concerns are logged separately from compliance issues. A task can be DONE_WITH_CONCERNS — technically complete but with notes that the reviewer should address.


Status Protocol

Every completed task returns one of four statuses:

DONE

The task is complete. The code matches the spec. Quality is acceptable. No concerns. Continue to the next task.

DONE_WITH_CONCERNS

The task is technically complete and the tests pass, but the reviewing agent has flagged something worth human attention. Examples:

  • The implementation works but uses a deprecated API
  • The test coverage is technically passing but is weak
  • A minor naming inconsistency was introduced

Work continues, but the concern is logged for human review at the end of the plan or at the next review checkpoint.

NEEDS_CONTEXT

The task could not be completed because the subagent encountered something it couldn't resolve with the information provided. This is not a failure — it's an honest report that more information is needed. Examples:

  • The specified file doesn't exist and the task doesn't say to create it
  • An imported module is not in the codebase
  • The task references a function signature that doesn't match what exists

The controller gathers the missing context and re-dispatches.

BLOCKED

Something is fundamentally wrong that prevents progress. Examples:

  • A dependency task was marked DONE but the code it was supposed to create doesn't exist
  • A test is failing in a way that can't be explained by the current task
  • A design assumption in the plan turns out to be incorrect

When a task returns BLOCKED, the controller stops execution and escalates to the user. This is a forcing function to prevent the AI from making unilateral decisions about plan-level problems.
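The four-status protocol maps naturally onto an enum, with one controller action per status. A small sketch (the action strings are labels for illustration, not commands):

```python
from enum import Enum

class TaskStatus(Enum):
    DONE = "done"
    DONE_WITH_CONCERNS = "done_with_concerns"
    NEEDS_CONTEXT = "needs_context"
    BLOCKED = "blocked"

# The controller's rule for each status
CONTROLLER_ACTION = {
    TaskStatus.DONE:               "mark complete, continue",
    TaskStatus.DONE_WITH_CONCERNS: "log concern, continue",
    TaskStatus.NEEDS_CONTEXT:      "gather context, re-dispatch",
    TaskStatus.BLOCKED:            "stop, escalate to user",
}
```

Keeping the mapping exhaustive is the point: there is no fifth status and no undefined behavior for the controller to improvise around.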


Model Selection by Complexity

Not all tasks require the same level of AI capability. Superpowers recommends matching task complexity to model tier:

  Task Type      Examples                                                     Recommended Model Tier
  Mechanical     Add import, rename variable, update config value             Smaller/faster model
  Integration    Wire together two existing systems, add a new API endpoint   Mid-tier model
  Architecture   Design a new abstraction, refactor a core data model         Most capable model

Using a smaller model for mechanical tasks reduces cost and latency. Using the most capable model for architecture ensures the important decisions get the best reasoning. Task complexity is typically described in the plan or can be inferred from the task description.
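A routing rule for this tiering can be a plain lookup. The complexity tags and model names below are placeholders, not real model identifiers:

```python
# Hypothetical complexity tags mapped to hypothetical model tiers
MODEL_BY_COMPLEXITY = {
    "mechanical":   "small-fast-model",
    "integration":  "mid-tier-model",
    "architecture": "most-capable-model",
}

def select_model(task_complexity: str) -> str:
    # Default to the most capable tier when complexity is unknown:
    # over-provisioning reasoning is safer than under-provisioning it.
    return MODEL_BY_COMPLEXITY.get(task_complexity, "most-capable-model")
```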


Parallel Agent Dispatching

Some tasks in a plan are independent — completing task 4 doesn't depend on task 5 being done. In these cases, the controller can dispatch multiple subagents simultaneously.

Rules for Parallel Dispatching

  1. No shared state — Tasks being run in parallel must not write to the same file or modify the same database record
  2. No sequential dependency — Task B must not need the output of Task A if they run in parallel
  3. Explicit declaration — Parallel tasks should be marked in the plan ([PARALLEL_GROUP: A]) so the controller knows they're safe to run together
  4. Review still applies — Each parallel task goes through the two-stage review independently when it completes

A plan might mark a parallel group like this:

  ### Task 6: Write auth middleware test [PARALLEL_GROUP: 1]
  ### Task 7: Write order validator test [PARALLEL_GROUP: 1]
  ### Task 8: Write cart service test [PARALLEL_GROUP: 1]

Tasks 6, 7, and 8 can be dispatched simultaneously. Task 9 must wait for all three to complete.
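A minimal sketch of safe parallel dispatching, assuming tasks are dicts with an `id` and a `files` list (field names are illustrative): refuse any group whose tasks touch the same file, then fan the rest out.

```python
from concurrent.futures import ThreadPoolExecutor

def dispatch_parallel_group(tasks, dispatch):
    """Dispatch an explicitly-marked parallel group of tasks.

    Enforces the no-shared-state rule before fanning out: if any two tasks
    in the group touch the same file, the group must run sequentially.
    """
    all_files = [f for t in tasks for f in t["files"]]
    if len(all_files) != len(set(all_files)):
        raise ValueError("parallel group shares files; run sequentially instead")
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(dispatch, tasks))   # results in task order
```

Each result still goes through the two-stage review independently; parallelism changes only when tasks run, not how they are reviewed.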

When NOT to Parallelize

  • When tasks modify the same file (merge conflicts are expensive)
  • When the output of one task changes the environment for another
  • When you're not sure (default to sequential when uncertain)

Fallback: Executing Plans Without Subagents

Some environments don't support subagent dispatching — the AI tool runs in a single-context mode without the ability to spawn independent processes.

In these cases, use the executing-plans skill (not the subagent dispatch flow). The executing-plans skill simulates subagent discipline within a single context by:

  • Treating each task as a clean unit with explicit context resets
  • Applying the same two-stage review after each task
  • Using the same status protocol
  • Never looking ahead to future tasks while executing the current one

The results are slightly less isolated than true subagents, but the discipline of the process provides most of the same benefits.

/executing-plans [path to plan document]

Common Execution Failures and How to Prevent Them

  • Task completes but tests don't pass
    Cause: subagent verified the file exists, not that tests pass
    Prevention: review gate must run tests, not just check file existence

  • DONE_WITH_CONCERNS accumulate
    Cause: controller keeps running despite concerns
    Prevention: set a threshold; if N concerns accumulate, pause and review

  • BLOCKED task is skipped
    Cause: controller tries to work around a blocked task
    Prevention: BLOCKED always stops execution; no workarounds

  • Parallel tasks conflict
    Cause: two tasks write to the same file
    Prevention: mark parallel groups explicitly; review for shared state before dispatching

  • Plan deviates during execution
    Cause: subagent "improves" on the plan
    Prevention: subagents must follow the plan exactly, not interpret or improve it
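The concern-accumulation guard is a one-liner worth making explicit. A sketch; the threshold value is a judgment call, and 3 is only an illustrative default:

```python
def should_pause_for_review(concerns, threshold=3):
    """Pause for human review once enough DONE_WITH_CONCERNS results pile up.

    Each individual concern is tolerable; a pile of them suggests the plan
    itself has drifted and a human should look before more work lands.
    """
    return len(concerns) >= threshold
```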

The subagent model is borrowed from good engineering management: give each engineer a clear, self-contained task, review their work before it merges, and never let one engineer's confusion become the whole team's problem.