Work

/flow-next:work walks a spec’s task list. For each ready task it spawns a worker subagent with fresh context. The worker re-anchors on the spec, implements, commits, optionally reviews, and marks the task done. Then the loop continues.

Each task runs in its own subagent. The benefits compound:

  • Fresh context per task prevents bleed between unrelated changes.
  • Re-anchor information stays bundled with its implementation.
  • Review cycles stay isolated to the task they belong to.
  • The main conversation stays lean even on long specs.

flowchart TB
  Main["Main session"] --> Pick["Pick next ready task"]
  Pick --> Spawn["Spawn worker subagent"]
  Spawn --> Anchor["Worker re-reads spec"]
  Anchor --> Impl["Implement + commit"]
  Impl --> Review["Optional review gate"]
  Review --> Done["Mark task done"]
  Done --> Sync["Plan-sync downstream tasks"]
  Sync --> Pick

The re-anchor is a single flowctl anchor <task-id> call: one deterministic bundle carrying the task, the parent spec, git state, memory and glossary indices, and dependency summaries — byte-equivalent to the discrete reads it replaced, so each worker starts in one round-trip instead of eight. The bundle is a floor, not a ceiling: the worker still reads anything else it needs.
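The loop described above can be sketched in a few lines. This is a minimal illustration, not flow-next's implementation: the `Task` class, `next_ready`, `run_spec`, and the `spawn_worker` callback are hypothetical names, and "ready" is assumed to mean that every dependency of the task is already done.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Task:
    id: str
    deps: list[str] = field(default_factory=list)
    done: bool = False


def next_ready(tasks: list[Task]) -> Optional[Task]:
    """Return the first unfinished task whose dependencies are all done (assumed meaning of "ready")."""
    done_ids = {t.id for t in tasks if t.done}
    for task in tasks:
        if not task.done and all(dep in done_ids for dep in task.deps):
            return task
    return None


def run_spec(tasks: list[Task], spawn_worker: Callable[[Task], None]) -> list[str]:
    """Process tasks one at a time; each spawn_worker call stands in for a fresh subagent."""
    order: list[str] = []
    while (task := next_ready(tasks)) is not None:
        spawn_worker(task)  # hypothetical: re-anchor, implement, commit, optional review
        task.done = True    # mark the task done, then loop back to pick the next one
        order.append(task.id)
    return order
```

Because the worker is passed in as a callback, the main loop holds only task IDs and completion flags, which mirrors how the main conversation stays lean while the heavy context lives in each subagent.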

| Mode | Use |
| --- | --- |
| current | Stay on the active branch. Default for solo or small changes. |
| new | Create a fresh feature branch off the current base. |
| worktree | Fully isolated parallel work via flow-next-worktree-kit. |

Worktree mode is the right choice when more than one spec is in flight, or when a review cycle on a separate branch must not disturb the work in progress.

When planSync.enabled is true in .flow/flow.yml, downstream task specs are checked for stale references after every completed task. If a task changed an API or path that later tasks assumed, the drift surfaces with a reason. The user decides whether to update the downstream specs, regenerate them, or accept the drift explicitly.

Plan-sync is conservative: it never silently rewrites a spec; it only proposes.
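As a rough illustration of the propose-only behaviour, the sketch below flags downstream specs that still mention a renamed path or API. How flow-next actually detects stale references is not stated here; the substring match, the `Drift` record, and `find_drift` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drift:
    task_id: str
    stale_ref: str
    reason: str


def find_drift(renames: dict[str, str], downstream_specs: dict[str, str]) -> list[Drift]:
    """Propose drift entries for specs that mention an old name; never edits the specs."""
    proposals = []
    for task_id, spec_text in downstream_specs.items():
        for old, new in renames.items():
            if old in spec_text:  # assumed detection rule: plain substring match
                proposals.append(Drift(task_id, old, f"'{old}' is now '{new}'"))
    return proposals
```

The function returns proposals and leaves `downstream_specs` untouched, so the decision to update, regenerate, or accept the drift stays with the user.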

When all tasks are done, an optional /flow-next:spec-completion-review runs to verify the combined implementation matches the spec end-to-end. This is the place to catch criteria that no individual task fully owned.

/flow-next:work <spec-id> --review=codex

--review=codex|rp|copilot|none selects the backend for a per-task adversarial review; none turns the review off. With a backend set, the worker loops fix → review until the verdict is SHIP before marking the task done.
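The review gate amounts to a loop that only exits on a SHIP verdict. A minimal sketch, assuming the reviewer and the fixer are plain callables; `review_until_ship` and the non-SHIP verdict string used in the usage note are hypothetical, and only SHIP comes from the text above.

```python
from typing import Callable


def review_until_ship(review: Callable[[], str], fix: Callable[[str], None]) -> int:
    """Alternate review and fix until the reviewer returns SHIP; return how many reviews ran."""
    rounds = 0
    while True:
        verdict = review()
        rounds += 1
        if verdict == "SHIP":
            return rounds  # only now may the task be marked done
        fix(verdict)       # hypothetical: address the feedback, then review again
```

For example, a reviewer that answers `"NEEDS_WORK"` twice and then `"SHIP"` causes two fix passes and three reviews. The sketch has no iteration cap because none is described here; a real loop may well have one.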

flowctl config set work.delegate codex # or pass delegate:codex as an arg
/flow-next:work <spec-id> delegate:codex

Codex delegation is an optional mode in which the Claude host stays the orchestrator (plan-reading, review, all git, and every decision) and delegates only the token-heavy code-writing to a local codex exec. It offloads that work to a separate Codex budget rather than trimming prompts, so a long spec never eats the host’s context.

Delegation is strictly opt-in and follows progressive disclosure: with delegation off (the default), the work flow is byte-identical. It costs one cheap config get, and nothing else loads. Every mechanic below activates only when you turn it on.

The host does not run Codex at a fixed setting. For each batch it picks the reasoning effort proportional to what that batch touches, then floors it at your configured minimum (work.delegateEffort, default medium). Routine work stays cheap; risky work automatically thinks harder.

| What the batch changes | Effort |
| --- | --- |
| Ordinary CRUD, small refactor, docs | medium |
| Auth, sessions, payments, DB migrations, external APIs, retry/fallback logic | high |
| Architectural or cross-cutting changes | xhigh |

The effort enum is ordered none < low < medium < high < xhigh. A per-batch pick below your floor is raised to the floor; a pick at or above it is kept. The model defaults to gpt-5.5 (work.delegateModel).
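The flooring rule is a maximum over the ordered enum. The sketch below shows that rule only; `EFFORT_ORDER` and `effective_effort` are hypothetical names, and how the host arrives at the per-batch pick is outside its scope.

```python
EFFORT_ORDER = ["none", "low", "medium", "high", "xhigh"]


def effective_effort(batch_pick: str, floor: str = "medium") -> str:
    """Raise a per-batch pick to the configured floor; keep picks at or above it."""
    pick_rank = EFFORT_ORDER.index(batch_pick)  # raises ValueError on an unknown level
    floor_rank = EFFORT_ORDER.index(floor)
    return EFFORT_ORDER[max(pick_rank, floor_rank)]
```

With the default floor, a `low` pick becomes `medium` while a `high` pick stays `high`. Raising the floor to `high` lifts routine batches as well, which trades cost for caution across the whole run.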

Codex only ever writes code. It is barred from git operations (commit, push, PRs) and scoped to the repo, and both limits are enforced, not just prompted:

  • Git ownership — the worker captures the base commit and asserts HEAD is unchanged after every codex exec. A sandbox can technically run git, so the HEAD assertion is the real guard, not the instruction.
  • .flow/ integrity — the worker snapshots non-scratch .flow/ before delegating and restores any unauthorized write afterward. Codex may only touch its own .flow/tmp/codex-<task-id>/ scratch dir.
  • Claude keeps plan-reading, the review loop, all commits, impl-review, and marking the task done.
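The git-ownership guard above can be approximated by comparing HEAD before and after the delegated run. This is a sketch under assumptions: `run_with_head_guard` and `run_delegate` are hypothetical names, and the real worker may capture and compare the base commit differently. The only external command used is `git rev-parse HEAD`.

```python
import subprocess
from typing import Callable


def git_head(repo: str = ".") -> str:
    """Return the current commit hash of the repository at `repo`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def run_with_head_guard(run_delegate: Callable[[], None],
                        read_head: Callable[[], str] = git_head) -> None:
    """Run the delegate, then fail loudly if HEAD moved while it was running."""
    base = read_head()   # capture the base commit before delegating
    run_delegate()       # hypothetical stand-in for the delegated codex exec call
    after = read_head()
    if after != base:
        raise RuntimeError(f"HEAD moved from {base} to {after} during delegation")
```

Checking the outcome rather than trusting the instruction is the point: the guard holds even if the delegate ignores its prompt and commits anyway. Passing `read_head` as a parameter keeps the check testable without a real repository.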

A delegated task is split into at most five batches, cut at phase or file boundaries, and a shared file is never split across batches. Trivial one-line changes skip delegation entirely because they are not worth the round-trip.
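The two batching constraints are easy to check mechanically. The sketch below validates a proposed plan rather than producing one, since the way the host chooses its boundaries is not described here; `validate_batches` is a hypothetical helper, and each batch is modelled simply as the set of files it touches.

```python
def validate_batches(batches: list[set[str]], max_batches: int = 5) -> list[str]:
    """Return the problems found in a batch plan; an empty list means the plan is acceptable."""
    problems = []
    if len(batches) > max_batches:
        problems.append(f"{len(batches)} batches exceeds the limit of {max_batches}")
    owner: dict[str, int] = {}  # file path -> index of the first batch that touches it
    for index, files in enumerate(batches):
        for path in sorted(files):
            if path in owner and owner[path] != index:
                problems.append(f"{path} is split across batches {owner[path]} and {index}")
            owner.setdefault(path, index)
    return problems
```

A plan such as `[{"a.py"}, {"a.py", "b.py"}]` is rejected because `a.py` appears in two batches; merging the two batches resolves it.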

Every batch returns a structured result (an --output-schema JSON: status, files modified, issues, summary, verification). Codex must run the batch’s tests in one process and may not report completed unless they pass — it verifies and self-fixes first. The host then classifies each result deterministically:

| Codex result | Host action |
| --- | --- |
| completed (after a cheap trust cross-check) | commit |
| partial | keep the diff, finish + verify locally, then commit |
| failed / malformed result | scoped rollback, count one strike |
| non-zero exit | rollback and disable delegation for the rest of the run |

The following safeguards apply on top of that classification:

  • One-time sandbox consent — the first delegated run asks once and persists the choice: yolo (default; full access incl. network, so it can install deps and run tests) or full-auto (workspace-write, tighter blast radius, no network).
  • MCP isolation — launched with --ignore-user-config so your Codex MCP servers don’t leak in (and don’t silently break the structured-output contract).
  • Recursion guard — won’t delegate if it is already running inside a Codex sandbox.
  • Scoped rollback — undoes only the failed batch’s files; never a bare git clean, and never touches .flow/ plan state.
  • Host-owned 3-strike circuit breaker — three consecutive failed tasks and delegation switches off for the rest of the run, falling back to standard in-session mode. The counter lives in the host loop (each task is a fresh worker subagent), so it actually persists across tasks.
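The result table and the circuit breaker combine into one small state machine owned by the host. The sketch below is illustrative only: `DelegationState`, `classify`, and the action strings are hypothetical, it treats each task as returning a single result, and it assumes that a completed or partial result resets the consecutive-failure counter, which is not spelled out above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_STRIKES = 3


@dataclass
class DelegationState:
    enabled: bool = True
    strikes: int = 0  # consecutive failures, kept by the host across worker subagents


def classify(state: DelegationState, exit_code: int, status: Optional[str]) -> str:
    """Map one Codex result to a host action and update the circuit breaker."""
    if exit_code != 0:
        state.enabled = False  # disable delegation for the rest of the run
        return "rollback-and-disable"
    if status == "completed":
        state.strikes = 0      # assumption: success resets the consecutive count
        return "commit"
    if status == "partial":
        state.strikes = 0      # assumption: partial results do not count as a strike
        return "finish-locally-then-commit"
    # "failed", or anything unrecognised, which is treated as a malformed result
    state.strikes += 1
    if state.strikes >= MAX_STRIKES:
        state.enabled = False  # fall back to standard in-session mode
    return "scoped-rollback"
```

Keeping `DelegationState` in the host loop matters because every task runs in a fresh worker subagent; a counter stored inside the worker would start from zero each time and never reach three.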

Delegation runs in both interactive and autonomous (Ralph) mode. Headless runs only need sandbox consent pre-granted in config, and a ralph-guard hook allowlists the exact canonical invocation. Commits carry mixed-model attribution trailers, so the code Codex wrote is recorded as Codex’s.

| Key | Default | Meaning |
| --- | --- | --- |
| work.delegate | local (off) | set to codex to turn delegation on |
| work.delegateModel | gpt-5.5 | the Codex model to drive |
| work.delegateEffort | medium | the reasoning-effort floor |
| work.delegateSandbox | yolo | yolo or full-auto |
| work.delegateDecision | ask | ask per task, or auto to delegate eligible tasks unprompted |

See the flowctl reference for the full key list.
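To make the defaults and allowed values concrete, the sketch below merges user overrides onto the defaults from the table and rejects values the table does not list. It is not how flowctl stores or validates config, which is not described here; `DEFAULTS`, `ALLOWED`, and `resolve` are hypothetical.

```python
DEFAULTS = {
    "work.delegate": "local",
    "work.delegateModel": "gpt-5.5",
    "work.delegateEffort": "medium",
    "work.delegateSandbox": "yolo",
    "work.delegateDecision": "ask",
}

# Values taken from the table and the effort enum; the model name is left unchecked.
ALLOWED = {
    "work.delegate": {"local", "codex"},
    "work.delegateEffort": {"none", "low", "medium", "high", "xhigh"},
    "work.delegateSandbox": {"yolo", "full-auto"},
    "work.delegateDecision": {"ask", "auto"},
}


def resolve(overrides: dict[str, str]) -> dict[str, str]:
    """Merge overrides onto the defaults, rejecting unknown keys and unlisted values."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown config keys: {unknown}")
    merged = {**DEFAULTS, **overrides}
    for key, allowed in ALLOWED.items():
        if merged[key] not in allowed:
            raise ValueError(f"{key}={merged[key]!r} is not one of {sorted(allowed)}")
    return merged
```

Calling `resolve({"work.delegate": "codex"})` turns delegation on and leaves every other key at its default, which matches the opt-in design: one key changes behaviour, the rest only tune it.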

/flow-next:make-pr <spec-id>