Why Flow-Next

AI agents drift when the work surface is a prompt plus a chat scrollback. They forget requirements, overfit to recent context, and produce diffs that are expensive to review.

Flow-Next fixes the operating model, not just the prompt.

AI makes code cheaper to produce. That shifts pressure to requirements, review, verification, and coordination. Traditional Agile touchpoints were designed for human-paced implementation, where unclear work could be corrected during a two-week cycle.

Agentic work does not have that safety valve. The spec must carry the weight.

Non-determinism makes precision non-optional

LLMs are non-deterministic. Give an agent a vague or inaccurate description and you do not get one wrong result; you get a different wrong result on every run. Precision is what makes the output stable: a reviewed, source-tagged spec that the agent rereads each time is how the same intent produces the same work twice, instead of a fresh guess.
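One way to picture "rereads each time" is a context builder that starts from the spec on disk rather than from chat history. The sketch below is illustrative only and is not Flow-Next's implementation: `build_anchor`, `anchor_fingerprint`, and the section labels are hypothetical names chosen for the example.

```python
import hashlib


def build_anchor(spec_text: str, task_text: str, git_state: str) -> str:
    """Assemble the context a worker reads before starting (hypothetical helper).

    Nothing here depends on earlier conversation turns, so the same spec,
    task, and git state always yield the same starting context.
    """
    sections = [
        ("SPEC", spec_text.strip()),
        ("TASK", task_text.strip()),
        ("GIT STATE", git_state.strip()),
    ]
    return "\n\n".join(f"## {label}\n{body}" for label, body in sections)


def anchor_fingerprint(anchor: str) -> str:
    """Short hash of the anchor, for confirming two runs began from identical input."""
    return hashlib.sha256(anchor.encode("utf-8")).hexdigest()[:12]
```

This pins the input, not the model's sampling: two runs with the same fingerprint started from the same intent, which is the part a spec can control.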

Agents forget, and one model has blind spots

No matter how good the model or the harness, agents drop things. They forget to update the docs you asked for, half-finish a task, or quietly skip a step. And every model family has blind spots in planning and implementation that same-family review tends to miss twice. Flow-Next answers both: enforced completion with receipts so nothing is silently dropped, and a different model reviewing each artifact so uncorrelated blind spots cancel.

flowchart TB
  Old["Human-paced delivery"] --> Loose["Loose ticket"]
  Loose --> Touchpoints["Standups, refinement, design review, PO chats"]
  Touchpoints --> Corrected["Corrected implementation"]

  New["Agent-paced delivery"] --> Precise["Precise spec"]
  Precise --> Fast["Fast implementation"]
  Fast --> Reviewable["Reviewable receipts and PR"]

• Spec-driven work: every task belongs to a durable .flow/specs/<id>.md
• Re-anchoring: every task starts by rereading the spec, the task, and git state
• Fresh context: worker subagents avoid contamination from previous attempts
• Cross-model review: a second model checks plan and implementation
• Receipts: state transitions are backed by artifacts, not promises
• PR-as-cognitive-aid: reviewers get acceptance coverage and focus paths
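The "receipts" idea in the list above can be sketched as a gate that refuses a state transition until the expected artifacts exist. This is a minimal illustration under assumptions: the function names, the directory layout, and the receipt file names are hypothetical and are not taken from Flow-Next.

```python
from pathlib import Path


def missing_receipts(task_dir: Path, required: list[str]) -> list[str]:
    """Return the required receipt files that are absent or empty in task_dir."""
    missing = []
    for name in required:
        path = task_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def mark_done(task_dir: Path, required: list[str]) -> str:
    """Allow the transition to 'done' only when every receipt is present."""
    missing = missing_receipts(task_dir, required)
    if missing:
        raise RuntimeError(f"cannot mark done, missing receipts: {missing}")
    return "done"
```

The point of the design is that "done" is derived from files on disk, so an agent claiming completion without producing the evidence fails loudly instead of passing silently.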

Cross-model review itself is not new; people have paired one model against another as a reviewer for a while. What Flow-Next did early was wire autonomous adversarial cross-model review into the loop: a different model challenges every plan and implementation automatically, at each handover and inside the autonomous modes (pilot, land, and Ralph), rather than as a manual pass someone has to remember to run. That is what lets you ship the loop’s output with confidence.
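The selection rule behind cross-model review can be sketched in a few lines: the reviewer must come from a different model family than the author. The function name and the model and family labels below are hypothetical placeholders for this example, not identifiers Flow-Next defines.

```python
def pick_reviewer(author_family: str, available: dict[str, str]) -> str:
    """Pick a reviewer model whose family differs from the author's.

    `available` maps a model name to its family label (both hypothetical).
    Names are sorted so the choice is stable across runs.
    """
    for model, family in sorted(available.items()):
        if family != author_family:
            return model
    raise LookupError(f"no reviewer outside family {author_family!r}")
```

Raising when no other family is available, rather than falling back to a same-family reviewer, keeps the guarantee explicit: a review either crossed families or did not happen.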

Before agents, a rough ticket could survive because the team finished the requirement during implementation. Daily conversation, pairing, Slack threads, design review, and ad-hoc product clarification filled in what the ticket missed.

When an agent can ship the task in one sitting, those touchpoints are gone. That does not make collaboration less important. It means collaboration has to move into explicit artifacts before the run starts:

| Missing touchpoint | Flow-Next replacement |
| --- | --- |
| Refinement meeting | /flow-next:interview --scope=business |
| Technical design chat | /flow-next:interview --scope=technical --strategy --docs |
| Developer breakdown | /flow-next:plan |
| Senior review of approach | /flow-next:plan-review |
| Pairing and course correction | Re-anchored /flow-next:work tasks |
| Human pre-review | /flow-next:impl-review |
| PR explanation | /flow-next:make-pr |

The artifact chain is not bureaucracy. It is the conversation that would otherwise be missing.

Flow-Next is not a hosted dashboard, SaaS tracker, Jira replacement, or CI runner. Everything lives in the repo under .flow/. To uninstall, run rm -rf .flow/.

It also does not remove human ownership. Humans still own product judgment, risk tolerance, merge decisions, and production responsibility. Flow-Next makes those decisions easier to verify because the evidence is structured.