Execution & Subagents

Each task runs in a fresh context. No contamination. No assumptions. No inherited confusion.


Why Fresh Subagents?

When an AI agent works on a long task, it accumulates context — previous decisions, earlier errors, half-formed assumptions from steps it took an hour ago. By the time it reaches task 15 in a plan, it may be subtly (or not so subtly) influenced by everything that happened in tasks 1 through 14.

This is called context contamination, and it's one of the primary causes of cascading failures in AI-assisted development. A misunderstanding in task 3 silently propagates. An incorrect assumption made in task 7 shapes the code written in tasks 8, 9, and 10. By the time anyone notices something is wrong, untangling it is expensive.

Superpowers solves this with subagents: fresh AI instances with no memory of previous tasks. Each task is dispatched to a clean context. Whatever happened before doesn't exist. The subagent reads the task, reads the relevant code, and executes exactly what the task says — nothing more, nothing less.

The Three Benefits of Context Isolation

  1. No error propagation — A mistake in task 3 doesn't affect task 4. Each task either succeeds or fails on its own.

  2. Unbiased review — The agent reviewing a completed task has no attachment to the code it's reviewing. It didn't write it. It has no reason to rationalize problems away.

  3. Parallel execution safety — Tasks that don't depend on each other can be dispatched simultaneously. Fresh contexts don't conflict.


The Controller Process

The controller is the main orchestrating agent — it manages the plan execution loop. Here's how it works:

┌──────────────────────────────────────────────────────────────┐
│                     CONTROLLER LOOP                          │
│                                                              │
│  1. Load plan → identify next PENDING task                   │
│  2. Dispatch subagent with: task description + relevant code │
│  3. Subagent completes task → returns result + status        │
│  4. Controller runs TWO-STAGE REVIEW                         │
│  5. Handle status:                                           │
│     DONE           → mark task complete, continue            │
│     DONE_WITH_CONCERNS → log, continue with note             │
│     NEEDS_CONTEXT  → gather context, re-dispatch             │
│     BLOCKED        → stop, escalate to user                  │
└──────────────────────────────────────────────────────────────┘
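The loop above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation; the task shape (dicts with `id`/`status` keys) and the callback signatures are assumptions made for the example.

```python
def run_controller_loop(plan, dispatch, review, gather_context):
    """Walk the plan in order, dispatching each PENDING task to a fresh subagent.

    dispatch(task)       -> (status, result) from a fresh subagent
    review(task, result) -> final status after the two-stage review
    gather_context(task) -> the task enriched with the missing context
    """
    concerns = []
    for task in plan:                       # plan is an ordered list of task dicts
        if task["status"] != "PENDING":
            continue
        status, result = dispatch(task)
        if status == "NEEDS_CONTEXT":
            task = gather_context(task)     # gather context, then re-dispatch
            status, result = dispatch(task)
        status = review(task, result)       # two-stage review may downgrade the status
        if status == "BLOCKED":
            return ("ESCALATE", task["id"], concerns)   # stop; hand off to the user
        if status == "DONE_WITH_CONCERNS":
            concerns.append((task["id"], result))       # log, keep going
        task["status"] = "DONE"
    return ("COMPLETE", None, concerns)
```

Note that the review runs on every path that continues: a re-dispatched task is reviewed the same way a first-attempt task is.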

Loading the Plan

The controller reads the full plan document and identifies the first task with Status: PENDING. It doesn't skip ahead, doesn't reorder, and doesn't decide to "combine" tasks for efficiency. The plan is the contract — the controller executes the contract.

Dispatching a Subagent

The controller creates a subagent with exactly the information the subagent needs:

  • The task description
  • The exact code to write (from the plan)
  • The exact files to modify
  • The exact commands to run

Nothing extra. No history of previous tasks (unless the task explicitly requires it as context). The subagent's job is to execute one task cleanly.
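As a sketch, the dispatch payload can be modeled as an immutable record. The field names here are illustrative; the skill does not prescribe a schema. What matters is what the record deliberately leaves out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskDispatch:
    """Exactly what a subagent receives -- and nothing else."""
    description: str          # the task description from the plan
    code_to_write: str        # exact code, copied from the plan
    files_to_modify: tuple    # exact file paths
    commands_to_run: tuple    # exact verification commands
    # Deliberately absent: history of previous tasks, the rest of the plan,
    # and any accumulated conversation context.
```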


Two-Stage Review

After a subagent completes a task, the controller does not immediately accept the result. It runs a two-stage review:

Stage 1: Spec Compliance

"Did the subagent build what the task specified?"

The controller compares the subagent's output against the task description:

  • Was the correct file modified?
  • Was the correct function created with the correct signature?
  • Were the required tests written?
  • Were any out-of-scope changes made?

If the output doesn't match the spec, the task is not complete. The controller notes the deviation and either re-dispatches with clarification or escalates to the user.
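Stage 1 amounts to a mechanical checklist comparison. A minimal sketch, assuming a simple dict shape for both the task spec and the subagent's reported output (all keys are hypothetical):

```python
def check_spec_compliance(task, output):
    """Return the list of deviations between the task spec and the subagent output.

    An empty list means the output matches the spec.
    """
    deviations = []
    # Was the correct file modified -- and nothing out of scope?
    if set(output["files_modified"]) != set(task["files_to_modify"]):
        deviations.append("wrong or out-of-scope files modified")
    # Was the required function created?
    if task["required_function"] not in output["functions_created"]:
        deviations.append("required function missing")
    # Were the required tests written?
    if task["tests_required"] and not output["tests_written"]:
        deviations.append("required tests not written")
    return deviations
```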

Stage 2: Code Quality

"Is the code correct, readable, and maintainable?"

This is evaluated independently of spec compliance. A subagent can perfectly match the spec and still produce poor code:

  • Missing error handling for obvious cases
  • Incorrect logic that happens to pass the specified test
  • Naming conventions that don't match the codebase
  • Security issues

Code quality concerns are logged separately from compliance issues. A task can be DONE_WITH_CONCERNS — technically complete but with notes that the reviewer should address.


Status Protocol

Every completed task returns one of four statuses:

DONE

The task is complete. The code matches the spec. Quality is acceptable. No concerns. Continue to the next task.

DONE_WITH_CONCERNS

The task is technically complete and the tests pass, but the reviewing agent has flagged something worth human attention. Examples:

  • The implementation works but uses a deprecated API
  • The test coverage is technically passing but is weak
  • A minor naming inconsistency was introduced

Work continues, but the concern is logged for human review at the end of the plan or at the next review checkpoint.

NEEDS_CONTEXT

The task could not be completed because the subagent encountered something it couldn't resolve with the information provided. This is not a failure — it's an honest report that more information is needed. Examples:

  • The specified file doesn't exist and the task doesn't say to create it
  • An imported module is not in the codebase
  • The task references a function signature that doesn't match what exists

The controller gathers the missing context and re-dispatches.

BLOCKED

Something is fundamentally wrong that prevents progress. Examples:

  • A dependency task was marked DONE but the code it was supposed to create doesn't exist
  • A test is failing in a way that can't be explained by the current task
  • A design assumption in the plan turns out to be incorrect

When a task returns BLOCKED, the controller stops execution and escalates to the user. This is a forcing function to prevent the AI from making unilateral decisions about plan-level problems.
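The four-status protocol maps naturally onto an enum, with one controller action per status. A small sketch (the action strings are labels for illustration, not commands):

```python
from enum import Enum

class TaskStatus(Enum):
    DONE = "done"
    DONE_WITH_CONCERNS = "done_with_concerns"
    NEEDS_CONTEXT = "needs_context"
    BLOCKED = "blocked"

# The controller's rule for each status
CONTROLLER_ACTION = {
    TaskStatus.DONE:               "mark complete, continue",
    TaskStatus.DONE_WITH_CONCERNS: "log concern, continue",
    TaskStatus.NEEDS_CONTEXT:      "gather context, re-dispatch",
    TaskStatus.BLOCKED:            "stop, escalate to user",
}
```

Keeping the mapping exhaustive is the point: there is no fifth status and no undefined behavior for the controller to improvise around.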


Model Selection by Complexity

Not all tasks require the same level of AI capability. Superpowers recommends matching task complexity to model tier:

  Task Type      Examples                                                     Recommended Model Tier
  Mechanical     Add import, rename variable, update config value             Smaller/faster model
  Integration    Wire together two existing systems, add a new API endpoint   Mid-tier model
  Architecture   Design a new abstraction, refactor a core data model         Most capable model

Using a smaller model for mechanical tasks reduces cost and latency. Using the most capable model for architecture ensures the important decisions get the best reasoning. Task complexity is typically described in the plan or can be inferred from the task description.
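A routing rule for this tiering can be a plain lookup. The complexity tags and model names below are placeholders, not real model identifiers:

```python
# Hypothetical complexity tags mapped to hypothetical model tiers
MODEL_BY_COMPLEXITY = {
    "mechanical":   "small-fast-model",
    "integration":  "mid-tier-model",
    "architecture": "most-capable-model",
}

def select_model(task_complexity: str) -> str:
    # Default to the most capable tier when complexity is unknown:
    # over-provisioning reasoning is safer than under-provisioning it.
    return MODEL_BY_COMPLEXITY.get(task_complexity, "most-capable-model")
```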


Parallel Agent Dispatching

Some tasks in a plan are independent — completing task 4 doesn't depend on task 5 being done. In these cases, the controller can dispatch multiple subagents simultaneously.

Rules for Parallel Dispatching

  1. No shared state — Tasks being run in parallel must not write to the same file or modify the same database record
  2. No sequential dependency — Task B must not need the output of Task A if they run in parallel
  3. Explicit declaration — Parallel tasks should be marked in the plan ([PARALLEL_GROUP: A]) so the controller knows they're safe to run together
  4. Review still applies — Each parallel task goes through the two-stage review independently when it completes

A plan might mark a parallel group like this:

  ### Task 6: Write auth middleware test [PARALLEL_GROUP: 1]
  ### Task 7: Write order validator test [PARALLEL_GROUP: 1]
  ### Task 8: Write cart service test [PARALLEL_GROUP: 1]

Tasks 6, 7, and 8 can be dispatched simultaneously. Task 9 must wait for all three to complete.
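A minimal sketch of safe parallel dispatching, assuming tasks are dicts with an `id` and a `files` list (field names are illustrative): refuse any group whose tasks touch the same file, then fan the rest out.

```python
from concurrent.futures import ThreadPoolExecutor

def dispatch_parallel_group(tasks, dispatch):
    """Dispatch an explicitly-marked parallel group of tasks.

    Enforces the no-shared-state rule before fanning out: if any two tasks
    in the group touch the same file, the group must run sequentially.
    """
    all_files = [f for t in tasks for f in t["files"]]
    if len(all_files) != len(set(all_files)):
        raise ValueError("parallel group shares files; run sequentially instead")
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(dispatch, tasks))   # results in task order
```

Each result still goes through the two-stage review independently; parallelism changes only when tasks run, not how they are reviewed.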

When NOT to Parallelize

  • When tasks modify the same file (merge conflicts are expensive)
  • When the output of one task changes the environment for another
  • When you're not sure (default to sequential when uncertain)

Fallback: Executing Plans Without Subagents

Some environments don't support subagent dispatching — the AI tool runs in a single-context mode without the ability to spawn independent processes.

In these cases, use the executing-plans skill (not the subagent dispatch flow). The executing-plans skill simulates subagent discipline within a single context by:

  • Treating each task as a clean unit with explicit context resets
  • Applying the same two-stage review after each task
  • Using the same status protocol
  • Never looking ahead to future tasks while executing the current one

The results are slightly less isolated than true subagents, but the discipline of the process provides most of the same benefits.

/executing-plans [path to plan document]

Common Execution Failures and How to Prevent Them

  • Task completes but tests don't pass
    Cause: subagent verified the file exists, not that tests pass
    Prevention: review gate must run tests, not just check file existence

  • DONE_WITH_CONCERNS accumulate
    Cause: controller keeps running despite concerns
    Prevention: set a threshold; if N concerns accumulate, pause and review

  • BLOCKED task is skipped
    Cause: controller tries to work around a blocked task
    Prevention: BLOCKED always stops execution; no workarounds

  • Parallel tasks conflict
    Cause: two tasks write to the same file
    Prevention: mark parallel groups explicitly; review for shared state before dispatching

  • Plan deviates during execution
    Cause: subagent "improves" on the plan
    Prevention: subagents must follow the plan exactly, not interpret or improve it
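The concern-accumulation guard is a one-liner worth making explicit. A sketch; the threshold value is a judgment call, and 3 is only an illustrative default:

```python
def should_pause_for_review(concerns, threshold=3):
    """Pause for human review once enough DONE_WITH_CONCERNS results pile up.

    Each individual concern is tolerable; a pile of them suggests the plan
    itself has drifted and a human should look before more work lands.
    """
    return len(concerns) >= threshold
```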

The subagent model is borrowed from good engineering management: give each engineer a clear, self-contained task, review their work before it merges, and never let one engineer's confusion become the whole team's problem.