Skip to content

Plan Review

/flow-next:plan-review runs a structured adversarial review of a Flow-Next spec.

The host agent is the coordinator, not the reviewer. The actual critique comes from a separate backend — RepoPrompt, Codex CLI, or GitHub Copilot CLI — and is captured as a review receipt the spec can be re-checked against.

Run plan-review after /flow-next:plan lands the spec and its task breakdown, before /flow-next:work starts implementation. The reviewer sees both the spec narrative and the task graph — gaps surfaced now avoid wasted worker cycles later.

Specifically:

  • The spec touches risky surfaces (auth, migrations, public APIs) and you want a second pair of eyes that did not see the original conversation.
  • The spec is large enough that fixing it post-implementation would mean throwing away worker output.
  • You plan to hand the spec to autonomous work and want adversarial pressure on assumptions before Ralph runs unattended.

Skip it for trivial specs where the cost of review exceeds the cost of redoing one task.

Priority (first match wins):

  1. --review=rp|codex|copilot|export|none argument.
  2. FLOW_REVIEW_BACKEND environment variable.
  3. .flow/config.jsonreview.backend.
  4. Hard error if nothing is configured.
BackendNotes
rpRepoPrompt (macOS GUI). Builder auto-selects context. Primary backend.
codexCodex CLI. Cross-platform. Default model gpt-5.5; tunable via FLOW_CODEX_MODEL and FLOW_CODEX_EFFORT.
copilotGitHub Copilot CLI. Cross-platform — native Windows works from flow-next 1.1.9 via stdin delivery (see Windows note below). Supports Claude Opus / Sonnet / Haiku 4.5 and GPT-5.2 families.
exportWrite the review prompt to a file for paste into ChatGPT Pro or Claude web.
noneSkip review entirely.

The spec grammar backend[:model[:effort]] works in the env var, in .flow/config.json, and via --spec. Per-spec overrides are set with flowctl spec set-backend.

Native Windows works. run_copilot_exec delivers the prompt via stdin on Windows (subprocess.run(input=prompt, ...)), sidestepping the CreateProcessW 32,767-char argv cap that broke the -p path for spec-sized prompts in 1.1.8 and earlier. Session continuity is tracked via a touch marker because stdin-mode --resume is resume-only (unlike -p mode’s create-or-resume). Verified by a real-subprocess Windows CI smoke. Upstream tracking github/copilot-cli#3398 — once a first-class --prompt-file flag lands there, both POSIX and Windows paths will converge.

Every backend gets the same prompt: a Carmack-level adversarial review. The reviewer looks for:

  • Acceptance criteria that cannot be observed.
  • Implicit assumptions the spec depends on.
  • Out-of-scope creep the plan invited.
  • Missing edge cases or failure modes.
  • Architecture mismatches with the existing codebase.

Receipts persist under .flow/review-receipts/. Re-reviews stay in the same chat session so the reviewer can see what changed since the first pass. The skill loops fix → review until the backend returns SHIP.

  • Not a code review — that is /flow-next:impl-review.
  • Not a spec-completion check — that is /flow-next:spec-completion-review.
  • Not a sanity check on grammar or formatting — the reviewer is asked to challenge substance, not polish.

Address the findings in the spec, then either:

Terminal window
/flow-next:plan-review <spec-id> # re-review until SHIP
/flow-next:work <spec-id> # implementation