Work

/flow-next:work walks a spec’s task list. For each ready task it spawns a worker subagent with fresh context. The worker re-anchors on the spec, implements, commits, optionally reviews, and marks the task done. Then the loop continues.

Worker subagent model

Each task runs in its own subagent. The benefits compound:

Fresh context per task prevents bleed between unrelated changes.
Re-anchor information stays bundled with its implementation.
Review cycles stay isolated to the task they belong to.
The main conversation stays lean even on long specs.

flowchart TB
  Main["Main session"] --> Pick["Pick next ready task"]
  Pick --> Spawn["Spawn worker subagent"]
  Spawn --> Anchor["Worker re-reads spec"]
  Anchor --> Impl["Implement + commit"]
  Impl --> Review["Optional review gate"]
  Review --> Done["Mark task done"]
  Done --> Sync["Plan-sync downstream tasks"]
  Sync --> Pick

The re-anchor is a single flowctl anchor <task-id> call. It returns one deterministic bundle carrying the task, the parent spec, git state, memory and glossary indices, and dependency summaries. The bundle is byte-equivalent to the discrete reads it replaced, so each worker starts in one round-trip instead of eight. It is a floor, not a ceiling: the worker still reads anything else it needs.

Branch modes

Mode	Use
`current`	Stay on the active branch. Default for solo or small changes.
`new`	Create a fresh feature branch off the current base.
`worktree`	Fully isolated parallel work via `flow-next-worktree-kit`.

Worktree mode is the right choice when more than one spec is in flight, or when a review cycle on a separate branch must not disturb the work in progress.

Plan-sync after each task

When planSync.enabled is true in .flow/flow.yml, downstream task specs are checked for stale references after every completed task. If a task changed an API or path that later tasks assumed, the drift surfaces with a reason. The user decides whether to update the downstream specs, regenerate them, or accept the drift explicitly.

Plan-sync is conservative: it never silently rewrites a spec; it only proposes.

Completion review gate

When all tasks are done, an optional /flow-next:spec-completion-review pass verifies that the combined implementation matches the spec end-to-end. This is the place to catch criteria that no individual task fully owned.

Optional review

/flow-next:work <spec-id> --review=codex

--review=codex|rp|copilot runs a per-task adversarial review with the named reviewer; --review=none skips it. When a review is enabled, the worker loops fix → review until SHIP before marking the task done.

Codex delegation (opt-in)

flowctl config set work.delegate codex   # or pass delegate:codex as an arg
/flow-next:work <spec-id> delegate:codex

Delegation is an optional mode in which the Claude host stays the orchestrator (plan-reading, review, all git, and every decision) and delegates only the token-heavy code-writing to a local codex exec. It offloads work to a separate Codex budget rather than trimming prompts, so a long spec never eats the host’s context.

It is strictly opt-in and follows progressive disclosure: with delegation off (the default), the work flow is byte-identical to a run without the feature. The only cost is one cheap config get, and nothing else loads. Every mechanic below activates only when you turn it on.

Reasoning effort scales with risk

The host does not run Codex at a fixed setting. For each batch it picks the reasoning effort proportional to what that batch touches, then floors it at your configured minimum (work.delegateEffort, default medium). Routine work stays cheap; risky work automatically thinks harder.

What the batch changes	Effort
Ordinary CRUD, small refactor, docs	`medium`
Auth, sessions, payments, DB migrations, external APIs, retry/fallback logic	`high`
Architectural or cross-cutting changes	`xhigh`

The effort enum is ordered none < low < medium < high < xhigh. A per-batch pick below your floor is raised to the floor; a pick at or above it is kept. The model defaults to gpt-5.5 (work.delegateModel).
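The floor rule can be sketched in a few lines of Python. This is only an illustration of the ordering and the "raise to the floor" behaviour described above; the `Effort` class, the `RISK_TO_EFFORT` labels, and `effective_effort` are hypothetical names, not part of flowctl.

```python
from enum import IntEnum


class Effort(IntEnum):
    """Reasoning-effort levels in the order the text gives them."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    XHIGH = 4


# Hypothetical risk labels standing in for the three rows of the table.
RISK_TO_EFFORT = {
    "routine": Effort.MEDIUM,        # CRUD, small refactor, docs
    "sensitive": Effort.HIGH,        # auth, payments, migrations, ...
    "architectural": Effort.XHIGH,   # cross-cutting changes
}


def effective_effort(pick: Effort, floor: Effort = Effort.MEDIUM) -> Effort:
    """Raise a per-batch pick to the configured floor; never lower it."""
    return max(pick, floor)
```

With a floor of `high`, a routine batch would run at `high`, while an architectural batch would still run at `xhigh`.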

The orchestration split

Codex only ever writes code. It is forbidden from git (commit / push / PRs) and scoped to the repo — and that is enforced, not just prompted:

Git ownership — the worker captures the base commit and asserts HEAD is unchanged after every codex exec. A sandbox can technically run git, so the HEAD assertion is the real guard, not the instruction.
.flow/ integrity — the worker snapshots non-scratch .flow/ before delegating and restores any unauthorized write afterward. Codex may only touch its own .flow/tmp/codex-<task-id>/ scratch dir.
Claude keeps plan-reading, the review loop, all commits, impl-review, and marking the task done.
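As a rough illustration of the git-ownership guard, the sketch below captures HEAD before a batch and compares it afterward. It assumes the check amounts to comparing `git rev-parse HEAD` output; `git_head`, `run_with_head_guard`, and `HeadMovedError` are hypothetical names, and the real worker may implement the assertion differently.

```python
import subprocess
from typing import Callable


class HeadMovedError(RuntimeError):
    """Raised when HEAD changed while a delegated batch was running."""


def git_head(repo: str = ".") -> str:
    """Return the commit hash that HEAD currently points at."""
    return subprocess.run(
        ["git", "-C", repo, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()


def run_with_head_guard(
    run_batch: Callable[[], None],
    read_head: Callable[[], str] = git_head,
) -> None:
    """Run one delegated batch and fail loudly if HEAD moved."""
    base = read_head()    # capture the base commit before delegating
    run_batch()           # stand-in for the codex exec call
    after = read_head()   # re-read HEAD once the batch returns
    if after != base:
        raise HeadMovedError(f"HEAD moved from {base} to {after}")
```

Passing `read_head` as a parameter keeps the guard testable without a real repository.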

A delegated task is split into at most five batches, each a logical unit cut at phase or file boundaries (a shared file is never split across batches). Trivial one-line changes skip delegation entirely, since they are not worth the round-trip.
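The two batching constraints (a cap of five, and no file shared between batches) can be expressed as a small check. This is a sketch under the assumption that a batch can be represented as a list of file paths; `validate_batches` is a hypothetical helper, not a flowctl function.

```python
from typing import Dict, List


def validate_batches(batches: List[List[str]], max_batches: int = 5) -> List[str]:
    """Return a list of constraint violations; an empty list means the plan is valid."""
    problems: List[str] = []
    if len(batches) > max_batches:
        problems.append(f"{len(batches)} batches exceeds the limit of {max_batches}")

    owner: Dict[str, int] = {}  # file path -> index of the batch that owns it
    for index, files in enumerate(batches):
        for path in sorted(set(files)):
            if path in owner:
                problems.append(
                    f"{path} is split across batches {owner[path]} and {index}"
                )
            else:
                owner[path] = index
    return problems
```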

Verify-before-complete contract

Every batch returns a structured result (an --output-schema JSON: status, files modified, issues, summary, verification). Codex must run the batch’s tests in one process and may not report completed unless they pass, so it verifies and self-fixes first. The host then classifies each result deterministically:

Codex result	Host action
`completed` (after a cheap trust cross-check)	commit
`partial`	keep the diff, finish + verify locally, then commit
`failed` / malformed result	scoped rollback, count one strike
non-zero exit	rollback and disable delegation for the rest of the run
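The table above maps onto a small deterministic function. The sketch below is illustrative only: `HostAction` and `classify` are hypothetical names, the rule that a non-zero exit is checked before the payload is an assumption, and the trust cross-check on `completed` is left out because its details are not specified here.

```python
from enum import Enum
from typing import Optional


class HostAction(Enum):
    COMMIT = "commit"
    FINISH_LOCALLY = "keep the diff, finish and verify locally, then commit"
    ROLLBACK_STRIKE = "scoped rollback, count one strike"
    ROLLBACK_DISABLE = "rollback and disable delegation for the rest of the run"


def classify(exit_code: int, result: Optional[dict]) -> HostAction:
    """Map one batch outcome to a host action, following the table row by row."""
    # Assumption: a non-zero exit is checked first, whatever the payload says.
    if exit_code != 0:
        return HostAction.ROLLBACK_DISABLE
    # A missing or non-object payload is treated as a malformed result.
    if not isinstance(result, dict):
        return HostAction.ROLLBACK_STRIKE
    status = result.get("status")
    if status == "completed":
        return HostAction.COMMIT  # the real host cross-checks before committing
    if status == "partial":
        return HostAction.FINISH_LOCALLY
    # "failed" and any unrecognised status both roll back and count a strike.
    return HostAction.ROLLBACK_STRIKE
```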

Safety rails

One-time sandbox consent — the first delegated run asks once and persists the choice: yolo (default; full access incl. network, so it can install deps and run tests) or full-auto (workspace-write, tighter blast radius, no network).
MCP isolation — launched with --ignore-user-config so your Codex MCP servers don’t leak in (and don’t silently break the structured-output contract).
Recursion guard — won’t delegate if it is already running inside a Codex sandbox.
Scoped rollback — undoes only the failed batch’s files; never a bare git clean, and never touches .flow/ plan state.
Host-owned 3-strike circuit breaker — three consecutive failed tasks and delegation switches off for the rest of the run, falling back to standard in-session mode. The counter lives in the host loop (each task is a fresh worker subagent), so it actually persists across tasks.
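A minimal sketch of the circuit breaker, assuming that "consecutive" means a successful task resets the count. `DelegationBreaker` is a hypothetical name; the point is that the object lives in the host loop and outlives each per-task worker.

```python
class DelegationBreaker:
    """Counts consecutive failed tasks and switches delegation off at the limit."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.consecutive_failures = 0
        self.enabled = True

    def record(self, task_failed: bool) -> None:
        """Record one finished task; call this from the host loop, not the worker."""
        if not self.enabled:
            return  # already tripped: stays off for the rest of the run
        if task_failed:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.limit:
                self.enabled = False
        else:
            # Assumption: any success breaks the streak.
            self.consecutive_failures = 0
```

If the counter were kept inside the worker instead, it would reset with every fresh subagent and never reach the limit.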

Ralph and attribution

Delegation runs in both interactive and autonomous (Ralph) mode. Headless runs only need consent pre-granted in config, and a ralph-guard hook allowlists the exact canonical invocation. Commits carry mixed-model attribution trailers, so the code Codex wrote is recorded as Codex’s.

Configuration

Key	Default	Meaning
`work.delegate`	`local` (off)	set to `codex` to turn delegation on
`work.delegateModel`	`gpt-5.5`	the Codex model to drive
`work.delegateEffort`	`medium`	the reasoning-effort floor
`work.delegateSandbox`	`yolo`	`yolo` or `full-auto`
`work.delegateDecision`	`ask`	`ask` per task, or `auto` to delegate eligible tasks unprompted

See the flowctl reference for the full key list.

Next step

/flow-next:make-pr <spec-id>