Work
/flow-next:work walks a spec’s task list. For each ready task it spawns a worker subagent with fresh context. The worker re-anchors on the spec, implements, commits, optionally reviews, and marks the task done. Then the loop continues.
Worker subagent model
Each task runs in its own subagent. The benefits compound:
- Fresh context per task prevents bleed between unrelated changes.
- Re-anchor information stays bundled with its implementation.
- Review cycles stay isolated to the task they belong to.
- The main conversation stays lean even on long specs.
flowchart TB
  Main["Main session"] --> Pick["Pick next ready task"]
  Pick --> Spawn["Spawn worker subagent"]
  Spawn --> Anchor["Worker re-reads spec"]
  Anchor --> Impl["Implement + commit"]
  Impl --> Review["Optional review gate"]
  Review --> Done["Mark task done"]
  Done --> Sync["Plan-sync downstream tasks"]
  Sync --> Pick
The re-anchor is a single flowctl anchor <task-id> call: one deterministic bundle carrying the task, the parent spec, git state, memory and glossary indices, and dependency summaries — byte-equivalent to the discrete reads it replaced, so each worker starts in one round-trip instead of eight. The bundle is a floor, not a ceiling: the worker still reads anything else it needs.
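The pick, spawn, and mark-done cycle described in this section can be sketched roughly as follows. This is an illustration only, not the actual flow-next implementation: the `Task` record, `pick_next_ready`, and `run_spec` are hypothetical names, and it assumes a task counts as "ready" once all of its dependencies are done.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Task:
    """Hypothetical task record; the field names are illustrative."""
    task_id: str
    depends_on: list[str] = field(default_factory=list)
    done: bool = False


def pick_next_ready(tasks: list[Task]) -> Optional[Task]:
    """Return the first unfinished task whose dependencies are all done."""
    finished = {t.task_id for t in tasks if t.done}
    for task in tasks:
        if not task.done and all(dep in finished for dep in task.depends_on):
            return task
    return None


def run_spec(tasks: list[Task], run_worker: Callable[[Task], None]) -> list[str]:
    """Drive the loop: pick a ready task, hand it to a fresh worker, mark it done."""
    order: list[str] = []
    while (task := pick_next_ready(tasks)) is not None:
        # Stands in for: spawn subagent, re-anchor, implement, commit, review.
        run_worker(task)
        task.done = True
        order.append(task.task_id)
    return order
```

Passing the worker in as a callable mirrors the isolation described above: the loop only tracks which tasks are done, and everything task-specific happens inside the worker.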
Branch modes
| Mode | Use |
|---|---|
| current | Stay on the active branch. Default for solo or small changes. |
| new | Create a fresh feature branch off the current base. |
| worktree | Fully isolated parallel work via flow-next-worktree-kit. |
Worktree mode is the right choice when more than one spec is in flight, or when a review cycle on a separate branch must not disturb the work in progress.
Plan-sync after each task
When planSync.enabled is true in .flow/flow.yml, downstream task specs are checked for stale references after every completed task. If a task changed an API or path that later tasks assumed, the drift surfaces with a reason. The user decides whether to update the downstream specs, regenerate them, or accept the drift explicitly.
Plan-sync is conservative: it never silently rewrites a spec; it only proposes.
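To make the "propose, never rewrite" behaviour concrete, here is a minimal sketch. How plan-sync actually detects drift is not described here, so this assumes the simplest possible check: a plain substring match of renamed paths against downstream spec text. `DriftProposal` and `find_drift` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftProposal:
    """Hypothetical record for one stale reference; nothing is rewritten."""
    task_id: str
    stale_reference: str
    reason: str


def find_drift(renames: dict[str, str],
               downstream_specs: dict[str, str]) -> list[DriftProposal]:
    """Report downstream specs that still mention something a finished task changed.

    Assumption: `renames` maps old path or API name to its new name, and
    `downstream_specs` maps task id to spec text. The specs are only read.
    """
    proposals: list[DriftProposal] = []
    for task_id, spec_text in downstream_specs.items():
        for old, new in renames.items():
            if old in spec_text:
                reason = f"{old} was changed to {new} by a completed task"
                proposals.append(DriftProposal(task_id, old, reason))
    return proposals
```

The function returns proposals and leaves every spec untouched, so the decision to update, regenerate, or accept the drift stays with the user.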
Completion review gate
When all tasks are done, an optional /flow-next:spec-completion-review runs to verify the combined implementation matches the spec end-to-end. This is the place to catch criteria that no individual task fully owned.
Optional review
/flow-next:work <spec-id> --review=codex
--review=codex|rp|copilot|none runs a per-task adversarial review. The worker loops fix → review until SHIP before marking the task done.
Codex delegation (opt-in)
flowctl config set work.delegate codex   # or pass delegate:codex as an arg
/flow-next:work <spec-id> delegate:codex
An optional mode where the Claude host stays the orchestrator (plan-reading, review, all git, and every decision) and delegates only the token-heavy code-writing to a local codex exec. It offloads work to a separate Codex budget rather than trimming prompts, so a long spec never eats the host’s context.
It is strictly opt-in and progressive-disclosure: with delegation off (the default), the work flow is byte-identical — one cheap config get, nothing else loads. Every mechanic below activates only when you turn it on.
Reasoning effort scales with risk
The host does not run Codex at a fixed setting. For each batch it picks the reasoning effort proportional to what that batch touches, then floors it at your configured minimum (work.delegateEffort, default medium). Routine work stays cheap; risky work automatically thinks harder.
| What the batch changes | Effort |
|---|---|
| Ordinary CRUD, small refactor, docs | medium |
| Auth, sessions, payments, DB migrations, external APIs, retry/fallback logic | high |
| Architectural or cross-cutting changes | xhigh |
The effort enum is none < low < medium < high < xhigh — a per-batch pick below your floor is raised to it, a pick at or above is kept. The model defaults to gpt-5.5 (work.delegateModel).
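The floor rule amounts to taking the higher of two values in the effort ordering. A minimal sketch, where `EFFORT_ORDER` and `effective_effort` are illustrative names rather than part of flowctl:

```python
# Ordering taken from the effort enum: none < low < medium < high < xhigh.
EFFORT_ORDER = ["none", "low", "medium", "high", "xhigh"]


def effective_effort(batch_pick: str, floor: str = "medium") -> str:
    """Raise a per-batch pick to the configured floor; keep it if already at or above.

    Raises ValueError if either value is not in the effort enum.
    """
    rank = EFFORT_ORDER.index
    return batch_pick if rank(batch_pick) >= rank(floor) else floor
```

With the default floor, a `low` pick is raised to `medium` and a `high` pick stays `high`. With the floor set to `high`, an ordinary `medium` batch would run at `high`.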
The orchestration split
Codex only ever writes code. It is forbidden from git (commit / push / PRs) and scoped to the repo, and that is enforced, not just prompted:
- Git ownership: the worker captures the base commit and asserts HEAD is unchanged after every codex exec. A sandbox can technically run git, so the HEAD assertion is the real guard, not the instruction.
- .flow/ integrity: the worker snapshots non-scratch .flow/ before delegating and restores any unauthorized write afterward. Codex may only touch its own .flow/tmp/codex-<task-id>/ scratch dir.
- Claude keeps plan-reading, the review loop, all commits, impl-review, and marking the task done.
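The HEAD assertion can be pictured as a before-and-after comparison around the delegated call. The sketch below is an assumption about the shape of that guard, not the worker's actual code. `run_with_head_guard` and `HeadMovedError` are hypothetical names, and the only real command used is `git rev-parse HEAD`.

```python
import subprocess
from typing import Callable


def git_head(repo_dir: str = ".") -> str:
    """Return the commit hash that HEAD currently points at."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


class HeadMovedError(RuntimeError):
    """Raised when the delegated step changed HEAD (hypothetical error type)."""


def run_with_head_guard(delegate: Callable[[], None],
                        read_head: Callable[[], str] = git_head) -> None:
    """Capture HEAD, run the delegated step, then assert HEAD did not move."""
    base = read_head()
    delegate()  # stands in for one codex exec batch
    after = read_head()
    if after != base:
        raise HeadMovedError(f"HEAD moved from {base} to {after} during delegation")
```

Checking the result afterwards, rather than trusting the instruction, is what makes this a guard: it holds even if the delegated process ignores the prompt.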
A delegated task is split into ≤5 logical units at phase/file boundaries (never splitting a shared file across batches), and trivial one-line changes skip delegation entirely — not worth the round-trip.
Verify-before-complete contract
Every batch returns a structured result (an --output-schema JSON: status, files modified, issues, summary, verification). Codex must run the batch’s tests in one process and may not report completed unless they pass; it verifies and self-fixes first. The host then classifies each result deterministically:
| Codex result | Host action |
|---|---|
| completed (after a cheap trust cross-check) | commit |
| partial | keep the diff, finish + verify locally, then commit |
| failed / malformed result | scoped rollback, count one strike |
| non-zero exit | rollback and disable delegation for the rest of the run |
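The table maps onto a small pure function. The sketch below is illustrative: `HostAction` and `classify` are hypothetical names, and it assumes that a non-zero exit takes precedence over whatever status the result reports. The trust cross-check on `completed` results is left out because its details are not specified here.

```python
from enum import Enum
from typing import Optional


class HostAction(Enum):
    COMMIT = "commit"
    FINISH_LOCALLY = "keep the diff, finish + verify locally, then commit"
    ROLLBACK_AND_STRIKE = "scoped rollback, count one strike"
    ROLLBACK_AND_DISABLE = "rollback and disable delegation for the rest of the run"


def classify(exit_code: int, result: Optional[dict]) -> HostAction:
    """Map one batch outcome to a host action, following the table above.

    Assumption: `result` is the parsed structured output, or None when it
    could not be parsed (treated as malformed).
    """
    if exit_code != 0:
        return HostAction.ROLLBACK_AND_DISABLE
    status = result.get("status") if isinstance(result, dict) else None
    if status == "completed":
        return HostAction.COMMIT  # trust cross-check omitted in this sketch
    if status == "partial":
        return HostAction.FINISH_LOCALLY
    # "failed", an unknown status, or a malformed result all count as a strike.
    return HostAction.ROLLBACK_AND_STRIKE
```

Because the function depends only on its inputs, the same result always produces the same action, which is the sense in which the classification is deterministic.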
Safety rails
- One-time sandbox consent: the first delegated run asks once and persists the choice, either yolo (default; full access incl. network, so it can install deps and run tests) or full-auto (workspace-write, tighter blast radius, no network).
- MCP isolation: launched with --ignore-user-config so your Codex MCP servers don’t leak in (and don’t silently break the structured-output contract).
- Recursion guard: won’t delegate if it is already running inside a Codex sandbox.
- Scoped rollback: undoes only the failed batch’s files; never a bare git clean, and never touches .flow/ plan state.
- Host-owned 3-strike circuit breaker: three consecutive failed tasks and delegation switches off for the rest of the run, falling back to standard in-session mode. The counter lives in the host loop (each task is a fresh worker subagent), so it actually persists across tasks.
Ralph and attribution
Delegation runs in interactive and autonomous (Ralph) mode; headless only needs consent pre-granted in config, and a ralph-guard hook allowlists the exact canonical invocation. Commits carry mixed-model attribution trailers, so the code Codex wrote is recorded as Codex’s.
Configuration
| Key | Default | Meaning |
|---|---|---|
| work.delegate | local (off) | set to codex to turn delegation on |
| work.delegateModel | gpt-5.5 | the Codex model to drive |
| work.delegateEffort | medium | the reasoning-effort floor |
| work.delegateSandbox | yolo | yolo or full-auto |
| work.delegateDecision | ask | ask per task, or auto to delegate eligible tasks unprompted |
See the flowctl reference for the full key list.
Next step
/flow-next:make-pr <spec-id>