Architecture

Flow-Next is deliberately split between agent-native skills and deterministic flowctl plumbing.

Skills perform work that needs codebase reading, judgment, sequencing, and user clarification:

  • /flow-next:capture
  • /flow-next:interview
  • /flow-next:plan
  • /flow-next:work
  • /flow-next:impl-review
  • /flow-next:make-pr

The host agent is the intelligence. Flow-Next does not spawn a second LLM from flowctl just to make a judgment the current agent can make directly.

flowctl is pure Python plumbing for operations that need determinism:

  • Atomic writes to .flow/
  • Schema validation
  • Task and spec state transitions
  • Review receipt validation
  • Git plumbing
  • Migration helpers
  • Backend dispatch wrappers

flowctl keeps its state in the repo-local .flow/ directory:

.flow/
├── specs/
├── tasks/
├── memory/
├── review-receipts/
├── bin/
└── usage.md

Keeping this state in one repo-local directory keeps the install non-invasive and the failure mode obvious.

flowchart TB
  Human["Human intent"] --> Skill["/flow-next skill"]
  Skill --> Agent["Host agent judgment"]
  Skill --> Flowctl["flowctl deterministic writes"]
  Agent --> Repo["Codebase"]
  Flowctl --> State[".flow state"]
  Repo --> Review["Review backend"]
  State --> Review
  Review --> Receipt["Receipt"]

The host agent reads code, asks clarifying questions, judges tradeoffs, and edits files. flowctl creates IDs, validates state, writes JSON and markdown atomically, and records transitions. The review backend supplies independent pressure so the same model that wrote the change is not the only reviewer.
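
The atomic-write behavior described above can be sketched as follows. This is a minimal illustration of the common temp-file-then-rename pattern, not flowctl's actual code; the helper names `atomic_write_text` and `atomic_write_json` and the example file path are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so readers see either the old file or the new one, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # overwrites the target in one rename (atomic on POSIX)
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def atomic_write_json(path: Path, payload: dict) -> None:
    """Serialize payload deterministically (sorted keys) and write it atomically."""
    atomic_write_text(path, json.dumps(payload, indent=2, sort_keys=True) + "\n")


if __name__ == "__main__":
    # Hypothetical file name, used only to exercise the helper.
    atomic_write_json(Path(".flow/tasks/example.json"), {"status": "todo"})
```

Writing to a sibling temp file first means an interrupted run leaves the previous file intact, which is one way to get the obvious failure mode the page describes.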

Anything that requires judgment stays in a skill:

  • Does this spec satisfy the product request?
  • Which code paths are relevant?
  • Is a finding introduced by this diff or pre-existing?
  • What should the PR reviewer read first?

Anything that should be deterministic stays in flowctl:

  • Allocate the next spec ID.
  • Mark a task started or done.
  • Validate dependencies.
  • Emit machine-readable receipt JSON.
  • Migrate repo-local .flow/ state.
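
To make the deterministic side concrete, here is a rough sketch of three of the operations listed above: ID allocation, task state transitions, and dependency validation. Everything in it is an assumption made for illustration: the `spec-` prefix, the status names, the transition table, and the function names are hypothetical and may not match flowctl.

```python
import re

# Hypothetical status names and transitions; the real flowctl state machine may differ.
ALLOWED_TRANSITIONS = {
    "todo": {"in_progress"},
    "in_progress": {"done"},
    "done": set(),
}


def next_spec_id(existing_ids: list[str], prefix: str = "spec-") -> str:
    """Return the ID one past the highest numeric ID already present."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    numbers = [int(m.group(1)) for i in existing_ids if (m := pattern.match(i))]
    return f"{prefix}{max(numbers, default=0) + 1}"


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


def validate_dependencies(deps: dict[str, list[str]]) -> None:
    """Raise ValueError if a dependency names an unknown task or forms a cycle."""
    for task, needs in deps.items():
        for dep in needs:
            if dep not in deps:
                raise ValueError(f"{task} depends on unknown task {dep}")

    visiting: set[str] = set()
    finished: set[str] = set()

    def visit(task: str) -> None:
        if task in finished:
            return
        if task in visiting:
            raise ValueError(f"dependency cycle through {task}")
        visiting.add(task)
        for dep in deps[task]:
            visit(dep)
        visiting.remove(task)
        finished.add(task)

    for task in deps:
        visit(task)
```

None of these functions needs judgment: the same inputs always give the same result or the same error, which is the property that separates flowctl's work from a skill's.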

This keeps Flow-Next portable across harnesses. Claude Code, Codex, and Droid can all run the same workflow because the intelligence is the current host agent, not a hidden service.

Harness         Primary role
Claude Code     Canonical plugin surface and slash-command workflow
OpenAI Codex    Codex mirror with equivalent skills and subagent dispatch
Factory Droid   Cross-platform agent runtime support
RepoPrompt      High-context review and external model review workflows

The docs use slash commands because that is the user-facing workflow. The CLI reference exists for lower-level automation and debugging.

Canonical Flow-Next skills use Claude Code’s AskUserQuestion primitive for blocking decisions. The Codex mirror does not call request_user_input, because that tool is unavailable outside Codex Plan mode.

As of Flow-Next 1.1.2, sync-codex.sh rewrites canonical AskUserQuestion invocations into plain-text numbered prompts for Codex:

1. Recommended option
2. Alternative option
3. Other — type your own answer

That gives Codex Desktop Default mode, Codex Plan mode, and Codex CLI the same behavior without runtime mode detection. Claude Code and Factory Droid keep their native blocking-question surfaces.
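
The shape of the rewritten prompt can be illustrated with a short sketch. This is not how sync-codex.sh works internally (it is a shell script whose implementation is not shown here); the function name `render_numbered_prompt` and its parameters are hypothetical, and the code only reproduces the numbered layout shown above.

```python
def render_numbered_prompt(question: str, options: list[str]) -> str:
    """Render a blocking question as plain text with a trailing free-form option."""
    lines = [question]
    for number, label in enumerate(options, start=1):
        lines.append(f"{number}. {label}")
    # \u2014 is the dash character used in the example prompt above.
    lines.append(f"{len(options) + 1}. Other \u2014 type your own answer")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_numbered_prompt(
        "Which approach should this task take?",  # hypothetical question text
        ["Recommended option", "Alternative option"],
    ))
```

Because the output is ordinary text, it needs no special tool support from the host, which fits the goal of identical behavior across Codex modes.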